&nbsp;
&nbsp;
![](../_resources/images/e2eai-5.jpg)


# Generative AI with Databricks

## From Predictive to Prescriptive Maintenance
Manufacturers face labor shortages, supply chain disruptions, and rising costs, making efficient maintenance essential. Despite investments in maintenance programs, many struggle to boost asset productivity due to technician shortages and poor knowledge-sharing systems. This leads to knowledge loss and operational inefficiencies.

<div style="font-family: 'DM Sans';">
  <div style="width: 400px; color: #1b3139; margin-left: 50px; margin-right: 50px; float: left;">
    <div style="color: #ff5f46; font-size:50px;">73%</div>
    <div style="font-size:25px; margin-top: -20px; line-height: 30px;">
      of manufacturers struggle to recruit maintenance technicians (McKinsey, 2023)
    </div>
    <div style="color: #ff5f46; font-size:50px;">55%</div>
    <div style="font-size:25px; margin-top: -20px; line-height: 30px;">
      of manufacturers lack formal knowledge-sharing systems (McKinsey, 2023)
    </div>
  </div>
</div>

Generative AI can transform maintenance by reducing downtime and improving productivity. While predictive maintenance anticipates failures, Generative AI enables prescriptive maintenance. Using historical data, AI systems can identify issues, generate solutions, and assist technicians, allowing junior staff to perform effectively and freeing experts for complex tasks.
<br><br>

### From Models to Agent Systems
Generative AI is moving from standalone models to modular agent systems ([Zaharia et al., 2024](https://bair.berkeley.edu/blog/2024/02/18/compound-ai-systems/)). These systems integrate retrievers, models, prompts, and tools to handle complex tasks. Their modular design allows seamless upgrades (e.g., integrating a new LLM) and adaptation to changing needs.

<br>
<img style="float: right; margin-top: 10px;" width="700px" src="https://raw.githubusercontent.com/databricks-demos/dbdemos-resources/refs/heads/main/images/manufacturing/lakehouse-iot-turbine/team_flow_liza.png" />

<br>

**Databricks empowers users with a Data + AI platform for Prescriptive Maintenance.** 
Let's explore how to deploy this in production.
<br><br>
<div style="font-size: 19px; margin-left: 0px; clear: left; padding-top: 10px; ">
<img src="https://raw.githubusercontent.com/databricks-demos/dbdemos-resources/refs/heads/main/images/liza.png" style="width:80px">
<br>
<h3 style="padding: 10px 0px 0px 5px;">Liza, a Generative AI engineer, uses the Databricks Data Intelligence Platform to:</h3>
<ul style="list-style: none; padding: 0; margin-left: 05%;">
  <li style="margin-bottom: 10px; display: flex; align-items: center;">
    <div class="badge" style="height: 30px; width: 30px; border-radius: 50%; background: #fcba33; color: white; text-align: center; line-height: 30px; font-weight: bold; margin-right: 10px;">1</div>
    Build real-time data pipelines
  </li>
  <li style="margin-bottom: 10px; display: flex; align-items: center;">
    <div class="badge" style="height: 30px; width: 30px; border-radius: 50%; background: #fcba33; color: white; text-align: center; line-height: 30px; font-weight: bold; margin-right: 10px;">2</div>
    Retrieve vectors & features
  </li>
  <li style="margin-bottom: 10px; display: flex; align-items: center;">
    <div class="badge" style="height: 30px; width: 30px; border-radius: 50%; background: #fcba33; color: white; text-align: center; line-height: 30px; font-weight: bold; margin-right: 10px;">3</div>
    Create AI agent tools
  </li>
  <li style="margin-bottom: 10px; display: flex; align-items: center;">
    <div class="badge" style="height: 30px; width: 30px; border-radius: 50%; background: #fcba33; color: white; text-align: center; line-height: 30px; font-weight: bold; margin-right: 10px;">4</div>
    Build & deploy agents
  </li>
  <li style="margin-bottom: 10px; display: flex; align-items: center;">
    <div class="badge" style="height: 30px; width: 30px; border-radius: 50%; background: #fcba33; color: white; text-align: center; line-height: 30px; font-weight: bold; margin-right: 10px;">5</div>
    Operate in batch or real-time
  </li>
  <li style="display: flex; align-items: center;">
    <div class="badge" style="height: 30px; width: 30px; border-radius: 50%; background: #fcba33; color: white; text-align: center; line-height: 30px; font-weight: bold; margin-right: 10px;">6</div>
    Evaluate agent performance
  </li>
</ul>
</div>


## Building Agent Systems with the Databricks Mosaic AI Agent Framework

🆓 **This notebook is 100% compatible with Databricks Free Edition!**

We will build an agent system that generates prescriptive work orders for wind turbine maintenance technicians. It combines several interacting components to support proactive maintenance and improve overall equipment effectiveness.

<img src="https://raw.githubusercontent.com/databricks-demos/dbdemos-resources/refs/heads/main/images/manufacturing/lakehouse-iot-turbine/iot_agent_graph_v2_0.png" style="margin-left: 5px; float: right"  width="1000px;">

Databricks simplifies this by providing a built-in service to:

- Create and store your AI tools as Unity Catalog (UC) functions
- Execute the AI tools safely
- Use agents to reason about the tools you selected and chain them together to answer your question


This notebook creates **TWO** Mosaic AI tools (Unity Catalog functions), which will be composed together into an agent in notebook [05.2-agent-creation-guide]($./05.2-agent-creation-guide):

1. **Turbine specifications retriever** - Retrieves a turbine's specifications (its latest sensor readings) from its ID (SQL function)
2. **Turbine maintenance predictor** - Calls a Model Serving endpoint to predict whether a turbine is at risk of failure (SQL function using ai_query)

⚠️ **Note about Free Edition:** The original demo included a third tool using Vector Search for maintenance guide retrieval. Vector Search is **not available in Databricks Free Edition**, so we focus on the two tools above, which work in Free Edition.


In [0]:
%pip install databricks-feature-engineering==0.8.0 databricks-sdk==0.40.0 
# Note: databricks-vectorsearch removed - not available in Free Edition
dbutils.library.restartPython()


Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


In [0]:
%run ../_resources/00-setup $reset_all_data=false

## Configuration file

Please change your catalog and schema here to run the demo on a different catalog.

 


# Technical Setup notebook. Hide this cell results
Initialize dataset to the current user and cleanup data when reset_all_data is set to true

Do not edit

USE CATALOG `main`
using catalog.database `main`.`e2eai_iot_turbine`


data already existing. Run with reset_all_data=true to force a data cleanup for your local demo.


## Part 1: Create the Turbine Specification Retriever as a tool to return sensor readings for a turbine

Edit the FROM table if you changed from the default catalog/schema in your config file.

In [0]:
%sql
DROP FUNCTION IF EXISTS turbine_specifications_retriever;

--turbine_specifications_retriever to get the current status of a turbine
--This function is used to retrieve the turbine specifications based on its id

CREATE OR REPLACE FUNCTION 
turbine_specifications_retriever(turbine_id STRING COMMENT 'ID of the wind turbine to look up')
RETURNS TABLE (
  avg_energy DOUBLE COMMENT 'Average energy reading for the hour',
  std_sensor_A DOUBLE COMMENT 'Standard deviation of sensor A readings',
  std_sensor_B DOUBLE COMMENT 'Standard deviation of sensor B readings',
  std_sensor_C DOUBLE COMMENT 'Standard deviation of sensor C readings',
  std_sensor_D DOUBLE COMMENT 'Standard deviation of sensor D readings',
  std_sensor_E DOUBLE COMMENT 'Standard deviation of sensor E readings',
  std_sensor_F DOUBLE COMMENT 'Standard deviation of sensor F readings'
)
LANGUAGE SQL
COMMENT 'This function retrieves the turbine sensor readings / specifications based on the turbine_id'
RETURN
(
SELECT 

avg_energy, std_sensor_A, std_sensor_B, std_sensor_C, std_sensor_D, std_sensor_E, std_sensor_F
FROM main.e2eai_iot_turbine.turbine_current_features
WHERE turbine_id = turbine_specifications_retriever.turbine_id
-- ORDER BY gives a total ordering; SORT BY only sorts within each partition
ORDER BY hourly_timestamp DESC
LIMIT 1
);

Now, test our tool:

In [0]:
%sql
SELECT * FROM turbine_specifications_retriever('004a641f-e9e5-9fff-d421-1bf88319420b')

avg_energy,std_sensor_A,std_sensor_B,std_sensor_C,std_sensor_D,std_sensor_E,std_sensor_F
0.074818609038791,1.058048335093487,2.4852932716249665,2.8927160852893152,2.1567050955955853,2.2120358529793696,5.614526027139428
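
To make the retriever's logic concrete, here is a minimal plain-Python sketch of the same lookup: filter by `turbine_id`, order by `hourly_timestamp` descending, and keep the first row. The `rows` list and the `latest_features` helper are hypothetical stand-ins for the feature table and the SQL function. The values are copied from sample rows shown in this notebook and cut down to two feature columns for brevity.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory stand-in for the feature table (two feature columns only).
rows: List[Dict] = [
    {"turbine_id": "004a641f-e9e5-9fff-d421-1bf88319420b",
     "hourly_timestamp": "2024-01-16T22:00:00.000Z",
     "avg_energy": 0.0847723255024208, "std_sensor_A": 1.0021697211227565},
    {"turbine_id": "004a641f-e9e5-9fff-d421-1bf88319420b",
     "hourly_timestamp": "2024-01-16T23:00:00.000Z",
     "avg_energy": 0.074818609038791, "std_sensor_A": 1.058048335093487},
    {"turbine_id": "00f27248-1f4f-e174-432c-53bd2a9158df",
     "hourly_timestamp": "2024-01-16T19:00:00.000Z",
     "avg_energy": 0.4915535666395597, "std_sensor_A": 1.0646332592567709},
]


def latest_features(rows: List[Dict], turbine_id: str) -> Optional[Dict]:
    """Return the most recent row for turbine_id, or None if there is none."""
    matching = [r for r in rows if r["turbine_id"] == turbine_id]
    if not matching:
        return None
    # ISO-8601 timestamps in one format and time zone sort chronologically as strings.
    return max(matching, key=lambda r: r["hourly_timestamp"])


latest = latest_features(rows, "004a641f-e9e5-9fff-d421-1bf88319420b")
print(latest["hourly_timestamp"], latest["avg_energy"])
```

Taking the maximum over all matching rows corresponds to a global ordering followed by `LIMIT 1`, which is what the tool needs to return the current state of a turbine.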


## Part 2: Create the Turbine Predictor as a tool to predict turbine failure

<img src="https://raw.githubusercontent.com/databricks-demos/dbdemos-resources/refs/heads/main/images/manufacturing/lakehouse-iot-turbine/iot_agent_graph_v2_1.png" style="float: right; width: 600px; margin-left: 10px">

To enable our agent system to predict turbine failures based on industrial IoT sensor readings, we will rely on the model we deployed previously in the [./04.3-running-inference-iot-turbine]($./04.3-running-inference-iot-turbine) notebook.

**Make sure you run this ML notebook to create the model serving endpoint!**


### Using the Model Serving as tool to predict faulty turbines
Let's define the turbine predictor tool function our LLM agent will be able to execute. 

AI agents use [AI Agent Tools](https://docs.databricks.com/en/generative-ai/create-log-agent.html#create-ai-agent-tools) to perform actions besides language generation, for example to retrieve structured or unstructured data, execute code, or talk to remote services (e.g. send an email or Slack message). 

These functions can contain any logic, from simple SQL to advanced Python. Below we wrap the model serving endpoint in a SQL function using the [ai_query](https://docs.databricks.com/en/sql/language-manual/functions/ai_query.html) function, as we tested in the previous notebook.

In [0]:
%sql
DROP FUNCTION IF EXISTS turbine_maintenance_predictor;

--Use turbine_maintenance_predictor to get a prediction of whether or not a turbine sensor is faulty to facilitate proactive maintenance
--This function is used to predict turbine maintenance based on energy and sensor readings

CREATE OR REPLACE FUNCTION 
turbine_maintenance_predictor(avg_energy DOUBLE, 
                              std_sensor_A DOUBLE, 
                              std_sensor_B DOUBLE, 
                              std_sensor_C DOUBLE, 
                              std_sensor_D DOUBLE, 
                              std_sensor_E DOUBLE, 
                              std_sensor_F DOUBLE
)
RETURNS STRING
LANGUAGE SQL
COMMENT 'This tool predicts whether or not a turbine is faulty to facilitate proactive maintenance. It expects 7 double values (average energy and six sensor readings) as input and returns a string indicating which sensor is predicted to be faulty or if all sensors are ok.'
RETURN
(
    SELECT 
        -- The xgboost model returns a float; translate it back to a string
        CASE WHEN float_prediction=0 THEN 'F'
            WHEN float_prediction=1 THEN 'ok'
            WHEN float_prediction=2 THEN 'B'
            WHEN float_prediction=3 THEN 'D'
        ELSE 'faulty' END AS prediction    
    FROM (
    SELECT ai_query('e2eai_iot_turbine_prediction_endpoint',
        STRUCT(avg_energy AS avg_energy,
            std_sensor_A AS std_sensor_A,
            std_sensor_B AS std_sensor_B,
            std_sensor_C AS std_sensor_C,
            std_sensor_D AS std_sensor_D,
            std_sensor_E AS std_sensor_E,
            std_sensor_F AS std_sensor_F
        ),
        'FLOAT'
    ) AS float_prediction)
);

Now, test our tool.

In [0]:
%sql
SELECT turbine_maintenance_predictor(
  0.9000803742589635,                           -- avg_energy
  2.2081154200781867,                           -- std_sensor_A
  2.6012126574143823,                           -- std_sensor_B
  2.1075958066966423,                           -- std_sensor_C
  2.2081154200781867,                           -- std_sensor_D
  2.6012126574143823,                           -- std_sensor_E
  2.1075958066966423                            -- std_sensor_F
) AS prediction
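
The `CASE` expression in the SQL function is a lookup table from the model's numeric class to a label. A minimal Python sketch of the same mapping follows. `decode_prediction` and `LABELS` are hypothetical names used only for illustration; the class-to-label pairs are copied from the SQL function.

```python
# Class index -> label, copied from the CASE expression in turbine_maintenance_predictor.
LABELS = {0: "F", 1: "ok", 2: "B", 3: "D"}


def decode_prediction(float_prediction: float) -> str:
    """Translate the model's numeric output into the label the tool returns."""
    # Any value outside the four known classes falls through to the ELSE branch.
    return LABELS.get(float_prediction, "faulty")


for value in (0.0, 1.0, 2.0, 3.0, 7.0):
    print(value, "->", decode_prediction(value))
```

Keeping the mapping in one place makes it easier to keep the SQL and Python versions of the tool consistent if the model's classes change.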

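Now build an alternate tool using the Python code we just tested. Because it reuses the name `turbine_maintenance_predictor`, `CREATE OR REPLACE` overwrites the SQL version defined above, and the new function takes a single array argument instead of seven separate values.

**You will need to add an API TOKEN and API ROOT before running this code.** The notebook API_TOKEN that we used for testing the Python code above will not work. Instead, create a [Personal Access Token](https://docs.databricks.com/aws/en/dev-tools/auth/pat).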

In [0]:
%sql
CREATE OR REPLACE FUNCTION turbine_maintenance_predictor(sensor_values ARRAY<DOUBLE>)
RETURNS STRING
LANGUAGE PYTHON
COMMENT 'This tool predicts whether or not a turbine is faulty to facilitate proactive maintenance. It expects an array of 7 double values (energy and sensor readings) as input and returns a string indicating if a particular sensor is predicted to be faulty or if all sensors are ok.'
AS 
$$

import numpy as np
import pandas as pd
import json 
import requests

#API TOKEN AND URL HERE

api_token = ""
api_root = ""

model_serving_endpoint_name = 'e2eai_iot_turbine_prediction_endpoint'

columns = ['avg_energy', 'std_sensor_A', 'std_sensor_B', 'std_sensor_C', 'std_sensor_D', 'std_sensor_E', 'std_sensor_F']

samp_ar = np.array([sensor_values])

data = pd.DataFrame(samp_ar, columns=columns)

url = f'{api_root}/serving-endpoints/{model_serving_endpoint_name}/invocations'

headers = {'Authorization': f'Bearer {api_token}', 
            'Content-Type': 'application/json'}


# data is always a DataFrame here, so build the dataframe_split payload directly
ds_dict = {'dataframe_split': data.to_dict(orient='split')}

data_json = json.dumps(ds_dict, allow_nan=True)

response = requests.request(method='POST', headers=headers, url=url, data=data_json)

if response.status_code != 200:
    raise Exception(f'Request failed with status {response.status_code}, {response.text}')

# Parse the response once, then map the predicted class to a label
prediction = response.json()['predictions'][0]

if prediction == 0:
    return 'Sensor F fault'
elif prediction == 1:
    return 'ok'
elif prediction == 2:
    return 'Sensor B fault'
elif prediction == 3:
    return 'Sensor D fault'
else:
    return 'faulty'

$$ 


Test the Python tool.

In [0]:
%sql
SELECT turbine_maintenance_predictor(array(0.1889792, 
                                           0.9644652, 
                                           2.65583866, 
                                           3.4528106, 
                                           2.48515875,
                                           2.28840325, 
                                           4.70213899)) as prediction

prediction
Sensor F fault
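
The request body sent by the Python tool uses the `dataframe_split` layout, which is what `DataFrame.to_dict(orient='split')` produces: an `index` list, a `columns` list, and a `data` list of rows. The sketch below builds the same structure with the standard library only, so the payload can be inspected without calling the endpoint. `build_payload` is a hypothetical helper, not part of the notebook.

```python
import json
from typing import List

COLUMNS = ["avg_energy", "std_sensor_A", "std_sensor_B", "std_sensor_C",
           "std_sensor_D", "std_sensor_E", "std_sensor_F"]


def build_payload(sensor_values: List[float]) -> str:
    """Serialize one row of readings in the dataframe_split layout."""
    if len(sensor_values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(sensor_values)}")
    body = {
        "dataframe_split": {
            "index": [0],                   # a single row
            "columns": COLUMNS,             # feature names, in model order
            "data": [list(sensor_values)],  # one list per row
        }
    }
    return json.dumps(body)


payload = build_payload([0.1889792, 0.9644652, 2.65583866, 3.4528106,
                         2.48515875, 2.28840325, 4.70213899])
print(payload)
```

Checking the length up front turns a malformed call into a clear error before any request is sent, which is useful when an agent assembles the array.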


Our agent should call turbine_specifications_retriever() to get sensor readings, then call turbine_maintenance_predictor() to get a prediction.
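
As a rough illustration of that call order, the sketch below chains two plain-Python stand-ins. Both are hypothetical stubs: `fake_retriever` returns a hard-coded row copied from the retriever's sample output earlier in this notebook, and `fake_predictor` only reports how many values it received instead of calling the serving endpoint. In the real system the agent, not hand-written code, decides to make these calls.

```python
from typing import Callable, Dict, List

# Order in which the predictor expects its inputs.
FEATURE_ORDER = ["avg_energy", "std_sensor_A", "std_sensor_B", "std_sensor_C",
                 "std_sensor_D", "std_sensor_E", "std_sensor_F"]


def diagnose(turbine_id: str,
             retriever: Callable[[str], Dict[str, float]],
             predictor: Callable[[List[float]], str]) -> str:
    """Fetch the latest readings for a turbine, then pass them to the predictor."""
    features = retriever(turbine_id)
    # Re-order by name so the predictor always sees the columns in model order.
    sensor_values = [features[name] for name in FEATURE_ORDER]
    return predictor(sensor_values)


def fake_retriever(turbine_id: str) -> Dict[str, float]:
    # Stub: values copied from the sample retriever output in this notebook.
    return {"avg_energy": 0.074818609038791, "std_sensor_A": 1.058048335093487,
            "std_sensor_B": 2.4852932716249665, "std_sensor_C": 2.8927160852893152,
            "std_sensor_D": 2.1567050955955853, "std_sensor_E": 2.2120358529793696,
            "std_sensor_F": 5.614526027139428}


def fake_predictor(sensor_values: List[float]) -> str:
    # Stub: a real tool would call the model serving endpoint here.
    return f"stub prediction for {len(sensor_values)} values"


print(diagnose("004a641f-e9e5-9fff-d421-1bf88319420b", fake_retriever, fake_predictor))
```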

## What's next: Create your Agent with Databricks Playground

✅ **Free Edition Ready!** Now that we have our 2 AI Tools created and registered in Unity Catalog (turbine_specifications_retriever and turbine_maintenance_predictor), we can compose them into an agent system using the Mosaic AI Agent Framework.

Open the [05.2-agent-creation-guide]($./05.2-agent-creation-guide) notebook to create and deploy the agent in Databricks Playground - 100% Free Edition compatible!


In [0]:
# Preview the hourly feature table that feeds the tools
df = spark.sql("SELECT * FROM main.e2eai_iot_turbine.turbine_hourly_features LIMIT 10")
display(df)

turbine_id,hourly_timestamp,avg_energy,std_sensor_A,std_sensor_B,std_sensor_C,std_sensor_D,std_sensor_E,std_sensor_F,location,model,state,abnormal_sensor
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T17:00:00.000Z,0.1889792040091697,0.9644652043128558,2.6558386572409103,3.4528106013576214,2.485158752607405,2.2884032468369284,4.702138990110717,Tupelo,EpicWind,America/Chicago,sensor_F
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T18:00:00.000Z,0.1921225762992177,1.0681855556261903,2.3848184303882847,3.303412042721332,2.172251292324001,2.342593019596896,4.870875418724548,Tupelo,EpicWind,America/Chicago,sensor_F
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T19:00:00.000Z,0.1735634457450677,1.1420887720146298,2.062708699095104,3.019329663712003,2.339552044868049,2.7306978700770164,4.237196637787606,Tupelo,EpicWind,America/Chicago,sensor_F
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T20:00:00.000Z,0.1034340926271473,1.0498727154061804,2.219216509159497,3.246726138931612,2.3204665834317817,2.662700177613455,4.289404582190178,Tupelo,EpicWind,America/Chicago,sensor_F
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T21:00:00.000Z,0.1548124352749333,1.0325552090494656,2.142101655549623,2.7298423212662217,2.3597486817214515,2.761466398058171,4.588788770497015,Tupelo,EpicWind,America/Chicago,sensor_F
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T22:00:00.000Z,0.0847723255024208,1.0021697211227565,2.0968943765292085,2.921547258775341,2.477840322666964,2.9466029618007314,4.357159925464822,Tupelo,EpicWind,America/Chicago,sensor_F
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T23:00:00.000Z,0.074818609038791,1.058048335093487,2.4852932716249665,2.8927160852893152,2.1567050955955853,2.2120358529793696,5.614526027139428,Tupelo,EpicWind,America/Chicago,sensor_F
00f27248-1f4f-e174-432c-53bd2a9158df,2024-01-16T17:00:00.000Z,0.1283965372105728,1.065608883199752,1.9263319253102171,3.3330563526547747,2.230040196141461,2.354626086386649,1.8913049031607985,Crystal Lake,EpicWind,America/Chicago,ok
00f27248-1f4f-e174-432c-53bd2a9158df,2024-01-16T18:00:00.000Z,0.8542245491303897,1.080309777815946,1.9618452098136363,2.9717426105145472,2.306627597988137,2.5166973688595817,1.980452948870913,Crystal Lake,EpicWind,America/Chicago,ok
00f27248-1f4f-e174-432c-53bd2a9158df,2024-01-16T19:00:00.000Z,0.4915535666395597,1.0646332592567709,2.2186746553400307,3.3459438407963438,2.2847856939507167,2.5560343320959498,1.9519204325253467,Crystal Lake,EpicWind,America/Chicago,ok
