
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img
    src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png"
    alt="Databricks Learning"
  >
</div>


## Workflow Notebook - Silver to Feature Store

1. **Widgets at the Top**:
   - In this notebook, you will find several parameterized widgets:
     - **catalog**
     - **column**
     - **primary_key**
     - **schema**
     - **silver_table_name**
     - **target_column**

2. **Purpose of Parameterization**:
   - These widgets allow you to configure parameters dynamically when setting up workflows.
   - Instead of modifying hard-coded values in the notebook, you can edit the parameters directly in the Databricks Workflows UI.

3. **Notebook Functionality**:
   - This notebook focuses on **feature engineering**.
   - Specifically, it normalizes the column named by the **column** widget (**Age** in this exercise) and generates a feature table.
   - The resulting feature table is stored in the **Feature Store** for use in downstream tasks like model training or evaluation.
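
As a rough sketch of how the six parameters could be gathered in one place, the helper below calls a getter once per widget name. The helper `read_params` and the values in `example_values` are hypothetical and exist only for illustration; in a Databricks notebook you would pass `dbutils.widgets.get` as the getter, while the dictionary stand-in lets the same code run locally.

```python
# Widget names as listed at the top of this notebook.
WIDGET_NAMES = [
    "catalog", "column", "primary_key",
    "schema", "silver_table_name", "target_column",
]


def read_params(getter, names=WIDGET_NAMES):
    """Collect every parameter by calling getter(name) for each widget name."""
    return {name: getter(name) for name in names}


# Hypothetical stand-in values for a local dry run (assumptions, not course data).
# In a Databricks notebook, replace the getter with dbutils.widgets.get.
example_values = {
    "catalog": "my_catalog",
    "column": "Age",
    "primary_key": "id",
    "schema": "my_schema",
    "silver_table_name": "my_silver_table",
    "target_column": "Diabetes_binary",
}

params = read_params(example_values.__getitem__)
print(params)
```

Passing the getter as an argument keeps the notebook logic independent of where the values come from, so the same cell can be exercised outside a workflow run.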

Read in silver-layer data.

In [0]:
catalog = dbutils.widgets.get(<FILL_IN>)
schema = dbutils.widgets.get(<FILL_IN>)
spark.sql(f"USE {catalog}.{schema}")

In [0]:
%skip
catalog = dbutils.widgets.get("catalog")
schema = dbutils.widgets.get("schema")
spark.sql(f"USE {catalog}.{schema}")

In [0]:
silver_table_name = dbutils.widgets.get(<FILL_IN>)
df = spark.read.format('delta').table(<FILL_IN>).toPandas()

In [0]:
%skip
silver_table_name = dbutils.widgets.get("silver_table_name")
df = spark.read.format('delta').table(silver_table_name).select('id', 'Diabetes_binary', 'HighBP', 'BMI', 'Smoker', 'Stroke', 'HeartDiseaseorAttack', 'Age').toPandas()

Perform feature engineering - normalize your column of choice.

In [0]:
import pandas as pd
import numpy as np

from databricks.feature_engineering import FeatureEngineeringClient

## Instantiate the FeatureEngineeringClient
fe = FeatureEngineeringClient()

## Normalize the selected column (Age in this exercise) and store it as <column>_normalized

column = dbutils.widgets.get(<FILL_IN>)
target_column = dbutils.widgets.get(<FILL_IN>)

df[f'{column}_normalized'] = <FILL_IN>


df = df.drop(target_column, axis=1)
df = df.drop(column, axis=1)
normalized_df = spark.createDataFrame(df)

primary_key = dbutils.widgets.get(<FILL_IN>)

## Set the feature table name for storage in UC
feature_table_name = f'{<FILL_IN>}.{<FILL_IN>}.{<FILL_IN>}_features'

## print(f"The name of the feature table: {feature_table_name}\n\n")

spark.sql(f'drop table if exists {feature_table_name}')

## Create the feature table
fe.create_table(
    <FILL_IN>
)

In [0]:
%skip

import pandas as pd
import numpy as np

from databricks.feature_engineering import FeatureEngineeringClient

## Instantiate the FeatureEngineeringClient
fe = FeatureEngineeringClient()

## Normalize the selected column (Age in this exercise) and store it as <column>_normalized

column = dbutils.widgets.get("column")
target_column = dbutils.widgets.get("target_column")

df[f'{column}_normalized'] = (df[column] - df[column].mean()) / df[column].std()


df = df.drop(target_column, axis=1)
df = df.drop(column, axis=1)
normalized_df = spark.createDataFrame(df)

primary_key = dbutils.widgets.get("primary_key")

## Set the feature table name for storage in UC
feature_table_name = f'{catalog}.{schema}.{silver_table_name}_features'

## print(f"The name of the feature table: {feature_table_name}\n\n")

spark.sql(f'drop table if exists {feature_table_name}')

## Create the feature table
fe.create_table(
    name = feature_table_name,
    primary_keys = primary_key,
    df = normalized_df, 
    description=f"{schema} quality features",
    tags = {"source": "silver", "format": "delta"}
)
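
The normalization used in the solution cell is a z-score: subtract the column mean, then divide by the column standard deviation. The sketch below reproduces that arithmetic with only the Python standard library so the result is easy to check by hand. The `z_score` helper and the `ages` values are made up for illustration. pandas `Series.std()` defaults to the sample standard deviation (`ddof=1`), which is also what `statistics.stdev` computes.

```python
from statistics import mean, stdev


def z_score(values):
    """Standardize values to zero mean and unit sample standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - mu) / sigma for v in values]


# Made-up values for illustration only; not taken from the silver table.
ages = [1.0, 2.0, 3.0, 4.0, 5.0]
normalized = z_score(ages)
print(normalized)
```

After this transformation the values are centered on zero and expressed in units of standard deviations, which is why the raw column can be dropped before the feature table is written.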

&copy; 2026 Databricks, Inc. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the <a href="https://www.apache.org/" target="_blank">Apache Software Foundation</a>.<br/><br/><a href="https://databricks.com/privacy-policy" target="_blank">Privacy Policy</a> | <a href="https://databricks.com/terms-of-use" target="_blank">Terms of Use</a> | <a href="https://help.databricks.com/" target="_blank">Support</a>