
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img
    src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png"
    alt="Databricks Learning"
  >
</div>



# LAB - Real-time Deployment with Model Serving

In this lab, you will deploy ML models with Databricks Model Serving using **offline feature tables** (Delta in Unity Catalog). This lab includes **two** sections.

In the first section, you will deploy a model for real-time inference with Model Serving's **UI**. This section demonstrates the simplest way to deploy a model with Model Serving.

In the second section, you will deploy a model **programmatically using the Databricks SDK (API)**.

For both sections, data preparation, model fitting and model registration are already done for you! You just need to focus on the deployment part.

**Lab Outline:**

* Simple real-time deployment
  - **Task 1:** Serve the model using the UI
  - **Task 2:** Query the endpoint

* API-based real-time deployment 
  - **Task 3:** Create an offline feature table
  - **Task 4:** Create a derived feature using a SQL function
  - **Task 5:** Prepare the feature table for inference
  - **Task 6:** (Optional) Define features with FeatureLookup/FeatureFunction for illustration
  - **Task 7:** Create training set and fit the model (offline join)
  - **Task 8:** Deploy the model
  - **Task 9:** Query the endpoint

## REQUIRED - SELECT CLASSIC COMPUTE
Before executing cells in this notebook, please select your classic compute cluster in the lab. Be aware that **Serverless** is enabled by default.

Follow these steps to select the classic compute cluster:
1. Navigate to the top-right of this notebook and click the drop-down menu to select your cluster. By default, the notebook will use **Serverless**.

2. If your cluster is available, select it and continue to the next cell. If the cluster is not shown:

   - Click **More** in the drop-down.

   - In the **Attach to an existing compute resource** window, use the first drop-down to select your unique cluster.

**NOTE:** If your cluster has terminated, you might need to restart it in order to select it. To do this:

1. Right-click on **Compute** in the left navigation pane and select *Open in new tab*.

2. Find the triangle icon to the right of your compute cluster name and click it.

3. Wait a few minutes for the cluster to start.

4. Once the cluster is running, complete the steps above to select your cluster.

## Requirements

Please review the following requirements before starting the lab:

* To run this notebook, you need to use the following Databricks runtime: **17.3.x-cpu-ml-scala2.13**


## Classroom Setup

Before starting the lab, install the required libraries and run the provided classroom setup script.

**📌 Note:** In this lab you will use the Databricks SDK to create a Model Serving endpoint, so the next code block **installs `databricks-sdk`** and then restarts Python.

The classroom setup script defines the configuration variables needed for the lab. Execute the following cells:

In [0]:
%pip install -U -qq databricks-sdk databricks-feature-engineering==0.12.1

dbutils.library.restartPython()

In [0]:
%run ../Includes/Classroom-Setup-4.3

**Other Conventions:**

Throughout this lab, we'll refer to the object `DA`. This object, provided by Databricks Academy, contains variables such as your username, catalog name, schema name, working directory, and dataset locations. Run the code block below to view these details:

In [0]:
print(f"Username:          {DA.username}")
print(f"Catalog Name:      {DA.catalog_name}")
print(f"Schema Name:       {DA.schema_name}")
print(f"Working Directory: {DA.paths.working_dir}")
print(f"Dataset Location:  {DA.paths.datasets}")

## Data and Model Preparation

Before you start the deployment process, you need to fit and register a model. In this section, you will load the dataset, fit a model, and register it in Unity Catalog (UC).

**Note:** All necessary code is provided, which means you don't need to complete anything in this section.

### Load Dataset

In [0]:
from pyspark.sql.functions import col, monotonically_increasing_id

## Set the path of the dataset
shared_volume_name = 'cdc-diabetes' # From Marketplace
csv_name = 'diabetes_binary_5050split_BRFSS2015' # CSV file name
dataset_path = f"{DA.paths.datasets.cdc_diabetes}/{shared_volume_name}/{csv_name}.csv" # Full path


df = spark.read.csv(dataset_path, inferSchema=True, header=True, multiLine=True, escape='"')\
    .na.drop(how='any')

df = df.withColumn("uniqueID", monotonically_increasing_id())   # Add unique_id column

## Dataset specs
primary_key = "uniqueID"
response = "Diabetes_binary"

## Separate features and ground-truth
features_df = df.drop(response)
response_df = df.select(primary_key, response)

## Convert data to pandas dataframes
X_train_pdf = features_df.drop(primary_key).toPandas()
Y_train_pdf = response_df.drop(primary_key).toPandas()

### Setup Model Registry with UC

Before we start model deployment, we need to fit and register a model. In this lab, **we will log models to Unity Catalog**, which means we first need to set the **MLflow Model Registry URI**.

In [0]:
import mlflow

## Point to UC model registry
mlflow.set_registry_uri("databricks-uc")
client = mlflow.MlflowClient()

### Helper Class for Model Creation

In [0]:
import time
import warnings
from mlflow.types.utils import _infer_schema
from mlflow.models import infer_signature
from sklearn.tree import DecisionTreeClassifier
from databricks.feature_engineering import FeatureEngineeringClient

model_name = f"{DA.catalog_name}.{DA.schema_name}.ml_diabetes_model" ## Use 3-level namespace

def get_latest_model_version(model_name):
    """Helper function to get the latest model version as a string"""
    model_version_infos = client.search_model_versions("name = '%s'" % model_name)
    model_version_list = [model_version_info.version for model_version_info in model_version_infos]
    ## Convert to integers for correct numeric comparison
    model_version_int_list = list(map(int, model_version_list))
    ## Find the maximum and convert back to a string
    return str(max(model_version_int_list))

def fit_and_register_model(X, Y, model_name_=model_name, random_state_=42, model_alias=None, log_with_fs=False, training_set_spec_=None):
    """Helper function to train and register a decision tree model"""

    clf = DecisionTreeClassifier(random_state=random_state_)
    with mlflow.start_run(run_name="LAB4-Real-Time-Deployment") as mlflow_run:

        ## Enable automatic logging of input samples, metrics, parameters, and models
        mlflow.sklearn.autolog(
            log_input_examples=True,
            log_models=False,
            log_post_training_metrics=True,
            silent=True)
        
        clf.fit(X, Y)

        ## Log model and push to registry
        if log_with_fs:
            # Infer output schema
            try:
                output_schema = _infer_schema(Y)
            except Exception as e:
                warnings.warn(f"Could not infer model output schema: {e}")
                output_schema = None
            
            ## Log using feature engineering client and push to registry
            fe = FeatureEngineeringClient()
            fe.log_model(
                model = clf,
                artifact_path = "decision_tree",
                flavor = mlflow.sklearn,
                training_set = training_set_spec_,
                output_schema = output_schema,
                registered_model_name = model_name_
            )
        
        else:
            signature = infer_signature(X, Y)
            example = X[:3]
            mlflow.sklearn.log_model(
                clf,
                artifact_path = "decision_tree",
                signature = signature,
                input_example = example,
                registered_model_name = model_name_
            )

        ## Set model alias
        if model_alias:
            time.sleep(10) ## Wait 10secs for model version to be created
            client.set_registered_model_alias(model_name_, model_alias, get_latest_model_version(model_name_))

    return clf
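
The helper `get_latest_model_version` converts version strings to integers before taking the maximum. The short sketch below, using made-up version strings, shows why: compared as strings, `"9"` sorts after `"10"`.

```python
# Hypothetical version strings, in the form a model registry returns them
versions = ["1", "2", "9", "10"]

# String comparison is lexicographic, so "9" beats "10"
latest_as_string = max(versions)

# Integer comparison gives the intended result
latest_as_int = str(max(map(int, versions)))

print(latest_as_string)  # 9
print(latest_as_int)     # 10
```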

### Fit and Register the Model

Before starting the deployment process, we will **fit and register a model**. The model's alias will be set to `Production`, and the model will be served with Databricks Model Serving in the next step.

In [0]:
model = fit_and_register_model(X_train_pdf, Y_train_pdf, model_name, 42, "Production")

## Simple Real-time Model Deployment

Now that the model is registered and ready for deployment, the next step is to create a serving endpoint with Model Serving and serve the model.

### Task 1: Serve the Model Using the UI

Serve the **"Production"** model that we registered in the previous section using the following endpoint configuration.

**Configuration:**

* Name: `la4-1-diabetes-model`

* Compute Size: `small` (CPU)

* Autoscaling: `Scale to zero`

* Tags: Define tags that might be meaningful for this deployment


**💡 Note:** Endpoint creation will take some time. You can work on the next section while the endpoint is being created.

### Task 2: Query the Endpoint 

Test the model deployment using the **Query endpoint** feature in the browser. Use the provided **Example request** payload to run inference with the model.
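
The example request is a JSON document. One input format that Model Serving endpoints accept is `dataframe_records`, a list with one object per input row. The sketch below builds such a payload with the standard library. `Income` and `Education` are columns of this dataset, but the values are invented, and a real request must include every feature column the model was trained on.

```python
import json

# Invented values for two of the dataset's columns; a real request needs all feature columns
rows = [
    {"Income": 6.0, "Education": 4.0},
    {"Income": 8.0, "Education": 6.0},
]

# "dataframe_records" format: one JSON object per input row
request_payload = {"dataframe_records": rows}
request_body = json.dumps(request_payload, indent=2)
print(request_body)
```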

## Real-time Model Deployment with Databricks Model Serving

In this section, you will deploy a model using **Databricks Model Serving** with an **offline feature table** stored in Unity Catalog.  
Unlike the previous section where you deployed through the **UI**, this time you will create and configure the serving endpoint programmatically using the **Databricks SDK (API)**.

First, you will review the registered model that was trained and logged using features from a Delta table in Unity Catalog.  
Then, you will deploy this model as a real-time serving endpoint using the API. Finally, you will query the endpoint to perform live inference using sample data records.

This workflow demonstrates how to automate model deployment with Databricks Model Serving using **offline feature tables**, providing a foundation for scalable and reproducible production ML workflows.

### Task 3: Create an Offline Feature Table

Let's create an **offline feature table** to store the features that will be used for model training and batch or real-time inference.  

For this task, you will set up the feature table as follows:

- The feature table will include **all feature columns** from the dataset  
- Define a **primary key** for uniquely identifying each record  
- Add a **description** for the table in Unity Catalog  

This table will be stored as a **Delta table** in Unity Catalog and can later be accessed directly by Model Serving for inference.

In [0]:
from databricks.feature_engineering import FeatureEngineeringClient

## Define feature table name and initialize Feature Engineering client
feature_table_name = f"{DA.catalog_name}.{DA.schema_name}.diabetes_features"
fe = FeatureEngineeringClient()

## Create the offline feature table
fe.<FILL_IN>

In [0]:
%skip

from databricks.feature_engineering import FeatureEngineeringClient

## Define the feature table name in Unity Catalog
feature_table_name = f"{DA.catalog_name}.{DA.schema_name}.diabetes_features"

## Initialize Feature Engineering client
fe = FeatureEngineeringClient()

## Create the offline feature table
fe.create_table(
    name=feature_table_name,
    df=features_df,
    primary_keys=[primary_key],
    description="Offline feature table containing diabetes dataset features for model training and inference"
)

### Task 4: Create a Derived Feature Using SQL Function

In this task, you will create a **derived feature** based on existing columns in the dataset.  
Instead of directly using **Education** and **Income**, you will compute a new field called  
**Education-Adjusted Income Index (EAI)** that represents a weighted interaction between the two.

This field is calculated using the formula:  
**`Education-Adjusted Income = Income × Education`**

*Note:* In real-world scenarios, correlated features such as income and education should be carefully examined for redundancy or multicollinearity. However, here the goal is to demonstrate how to define and register a simple SQL function in Unity Catalog that can be referenced during data processing or model training.

The function should be structured as follows, using the variable names defined below:  
- **Function name:** `eai_function`  
- **Input:** `Income`, `Education`  
- **Output:** `eai`
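
To sanity-check the formula before registering it, the multiplication can be tried locally. This is a plain-Python sketch of the same arithmetic, not the Unity Catalog function itself. The helper name `eai` and the sample values are chosen for illustration only.

```python
def eai(income: float, education: float) -> float:
    """Education-Adjusted Income: the product of the two inputs."""
    return income * education

# Invented sample values
print(eai(6.0, 4.0))  # 24.0
```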


In [0]:
## Create or replace a SQL function to compute the derived feature
spark.sql(<FILL_IN>)

In [0]:
%skip
## Create or replace a SQL function to compute the derived feature
spark.sql(f"""
CREATE OR REPLACE FUNCTION eai_function (Income DOUBLE, Education DOUBLE)
RETURNS DOUBLE
LANGUAGE PYTHON AS
$$
eai = Income * Education
return eai
$$
""")

### Task 5: Prepare the Feature Table for Inference

In this task, you will make sure that the **offline feature table** you created earlier can be used directly for **model inference**.  
This step ensures that the feature table is properly configured in Unity Catalog and that change tracking is enabled for incremental updates.

**Perform the following steps:**

* Enable **Change Data Feed (CDF)** on the feature table to allow incremental updates and lineage tracking  
* Verify that the feature table is registered in Unity Catalog and available for use by Model Serving  

The resulting table will remain an **offline Delta table**, suitable for both batch and real-time inference through Model Serving.
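
One way to confirm that Change Data Feed is on is to inspect the table's properties once you have retrieved them. The sketch below assumes the properties are available as a string-to-string mapping, with `delta.enableChangeDataFeed` as the key. `cdf_enabled` is a hypothetical helper, not part of the Databricks SDK, and the mappings passed to it are invented.

```python
from typing import Mapping, Optional

def cdf_enabled(properties: Optional[Mapping[str, str]]) -> bool:
    """Return True if the Change Data Feed property is set to true."""
    if not properties:
        return False
    # Property values are strings, so compare case-insensitively
    return properties.get("delta.enableChangeDataFeed", "false").lower() == "true"

# Invented property mappings for illustration
print(cdf_enabled({"delta.enableChangeDataFeed": "true"}))
print(cdf_enabled({}))
```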

In [0]:
from pprint import pprint
from databricks.sdk import WorkspaceClient

## Initialize the Workspace client
workspace = WorkspaceClient()

## Enable Change Data Feed (CDF) on the offline feature table
spark.sql(<FILL_IN>)

## Retrieve and display table details from Unity Catalog
feature_table_details = <FILL_IN>

pprint(feature_table_details)

In [0]:
%skip
from pprint import pprint
from databricks.sdk import WorkspaceClient

## Initialize the Workspace client
workspace = WorkspaceClient()

## Enable Change Data Feed (CDF) on the offline feature table
spark.sql(f"""ALTER TABLE {feature_table_name} SET TBLPROPERTIES (delta.enableChangeDataFeed = true)""")

## Retrieve and display table details from Unity Catalog
feature_table_details = workspace.tables.get(feature_table_name)

pprint(feature_table_details)

### Task 6: Define Features

Now that you have an **offline feature table** and a registered SQL function, you will combine them so the model can use both the stored features and the derived feature during **training**.

In [0]:
from databricks.feature_engineering import FeatureLookup, FeatureFunction
## Define combined feature lookup (offline table) and derived feature function
features = [<FILL_IN>]

In [0]:
%skip
from databricks.feature_engineering import FeatureLookup, FeatureFunction

## Define combined feature lookup (offline table) and derived feature function
features = [
    FeatureLookup(
        table_name=feature_table_name,
        lookup_key=primary_key
    ),
    FeatureFunction(
        udf_name="eai_function",
        output_name="eai",
        input_bindings={
            "Education": "Education",
            "Income": "Income"
        },
    ),
]

### Task 7: Create Training Set and Fit the Model

Now that the feature configuration is in place, create the training set and fit the model.
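
The training set for this task is an inner join of the feature rows and the label rows on the primary key, with the derived feature added as an extra column. The sketch below shows the same logic in plain Python rather than Spark, using a few invented records. Only the column names come from the lab.

```python
# Invented feature rows keyed by uniqueID
feature_rows = [
    {"uniqueID": 0, "Income": 6.0, "Education": 4.0},
    {"uniqueID": 1, "Income": 8.0, "Education": 6.0},
    {"uniqueID": 2, "Income": 3.0, "Education": 5.0},
]
# uniqueID -> Diabetes_binary; ID 2 has no label, so the inner join drops it
labels = {0: 1.0, 1: 0.0}

# Inner join on the key, then add the derived feature
training_rows = [
    {
        **row,
        "eai": row["Income"] * row["Education"],
        "Diabetes_binary": labels[row["uniqueID"]],
    }
    for row in feature_rows
    if row["uniqueID"] in labels
]

for row in training_rows:
    print(row)
```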

In [0]:
from pyspark.sql import functions as F

## Build an offline training dataframe (join features + label; compute derived feature offline)
training_df_offline = (
    <FILL_IN>
)

## Convert to pandas
X_train_pdf2 = training_df_offline.drop(primary_key, response).toPandas()
Y_train_pdf2 = training_df_offline.select(response).toPandas()

## Fit and register the model (OFFLINE: no FS metadata)
model_name_2 = f"{DA.catalog_name}.{DA.schema_name}.ml_diabetes_model_fe"
model_fe = fit_and_register_model(
    X_train_pdf2,
    Y_train_pdf2,
    model_name_2,
    20,
    log_with_fs=False,          
    training_set_spec_=None 
)

In [0]:
%skip

from pyspark.sql import functions as F

## Build an offline training dataframe (join features + label; compute derived feature offline)
training_df_offline = (
    features_df.join(response_df, on=primary_key, how="inner")
               .withColumn("eai", F.col("Income") * F.col("Education"))
)

## Convert to pandas
X_train_pdf2 = training_df_offline.drop(primary_key, response).toPandas()
Y_train_pdf2 = training_df_offline.select(response).toPandas()

## Fit and register the model (OFFLINE: no FS metadata)
model_name_2 = f"{DA.catalog_name}.{DA.schema_name}.ml_diabetes_model_fe"
model_fe = fit_and_register_model(
    X_train_pdf2,
    Y_train_pdf2,
    model_name_2,
    20,
    log_with_fs=False,          
    training_set_spec_=None 
)

### Task 8: Deploy the Model (Offline Features)

Create a serving endpoint with the following configuration:

* Autoscaling: `Scale-to-zero`
* Compute size: `Small`

**💡 Note:** Endpoint creation will take some time. Please wait while the endpoint is created.

In [0]:
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import EndpointCoreConfigInput, EndpointTag

## Initialize Workspace client
w = WorkspaceClient()

## Get the model version that will be served (from the offline-logged model)
model_version = <FILL_IN>

## Define the endpoint configuration
endpoint_config_dict = {
    "served_models": [
        {
            <FILL_IN>
        }
    ]
}
endpoint_config = <FILL_IN>

## Define endpoint name
endpoint_name = <FILL_IN>

## Create the serving endpoint
try:
    w.<FILL_IN>(
        name=<FILL_IN>,
        config=<FILL_IN>,
        tags=[EndpointTag.from_dict({"key": "db_academy", "value": "lab4_serve_offline_model"})]
    )
    print(f"Creating endpoint {endpoint_name} with model {model_name_2} version {model_version}")
except Exception as e:
    ## Match on the message text; e.args may be empty or hold non-string values
    if "already exists" in str(e):
        print(f"Endpoint with name {endpoint_name} already exists")
    else:
        raise

In [0]:
%skip
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import EndpointCoreConfigInput, EndpointTag

w = WorkspaceClient()

## Get the model version that will be served (from the offline-logged model above)
model_version = get_latest_model_version(model_name_2)

endpoint_config_dict = {
    "served_models": [
        {
            "model_name": model_name_2,
            "model_version": model_version,
            "scale_to_zero_enabled": True,
            "workload_size": "Small"
        }
    ]
}
endpoint_config = EndpointCoreConfigInput.from_dict(endpoint_config_dict)

endpoint_name = f"ML_AS_03_Lab4_{DA.unique_name('_')}"

try:
    w.serving_endpoints.create_and_wait(
        name=endpoint_name,
        config=endpoint_config,
        tags=[EndpointTag.from_dict({"key": "db_academy", "value": "lab4_serve_offline_model"})]
    )
    print(f"Creating endpoint {endpoint_name} with model {model_name_2} version {model_version}")
except Exception as e:
    ## Match on the message text; e.args may be empty or hold non-string values
    if "already exists" in str(e):
        print(f"Endpoint with name {endpoint_name} already exists")
    else:
        raise

### Task 9: Query the Endpoint

After the endpoint is created, it is time to test it. Use the following test sample, three records drawn from the training features with a fixed random seed, to query the endpoint using the API.

In [0]:
# Sample a few records for testing
payload = X_train_pdf2.sample(3, random_state=42).to_dict(orient="records")

In [0]:
## Query the serving endpoint with test-sample
query_response = w.serving_endpoints.<FILL_IN>

print("Inference results:", query_response.predictions)

In [0]:
%skip
## Query the serving endpoint
query_response = w.serving_endpoints.query(
    name=endpoint_name,
    dataframe_records=payload
)

print("Inference results:", query_response.predictions)
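
To read the output more easily, each input record can be paired with its prediction by position. The sketch below assumes the endpoint returns one prediction per record, in request order. The records and predictions are invented stand-ins for `payload` and `query_response.predictions`.

```python
# Invented stand-ins for the request records and the returned predictions
sample_records = [
    {"Income": 6.0, "Education": 4.0, "eai": 24.0},
    {"Income": 8.0, "Education": 6.0, "eai": 48.0},
]
sample_predictions = [1.0, 0.0]

# Pair each record with its prediction by position
results = [
    {**record, "prediction": prediction}
    for record, prediction in zip(sample_records, sample_predictions)
]

for result in results:
    print(result)
```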


## Conclusion

Great job completing this lab!  
In this lab, you successfully explored two key ways of deploying machine learning models with **Databricks Model Serving** using **offline feature tables**.

- In the **first section**, you deployed a model using the **UI**, demonstrating the simplest way to expose a registered model for real-time inference.  
- In the **second section**, you used the **Databricks SDK (API)** to automate model deployment. You created and prepared an offline feature table in Unity Catalog, defined a derived feature, trained and registered an offline model **without Feature Store metadata**, and deployed it to a real-time serving endpoint.  
- Finally, you tested your endpoint by sending complete feature vectors for live inference.

This workflow provides a foundation for building scalable, reproducible, and fully managed **real-time inference pipelines** using Databricks Model Serving with **offline Delta-based feature tables**.

&copy; 2026 Databricks, Inc. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the <a href="https://www.apache.org/" target="_blank">Apache Software Foundation</a>.<br/><br/><a href="https://databricks.com/privacy-policy" target="_blank">Privacy Policy</a> | <a href="https://databricks.com/terms-of-use" target="_blank">Terms of Use</a> | <a href="https://help.databricks.com/" target="_blank">Support</a>