# Demo: Using Agent Bricks for Root Cause Analysis on Medical Device Reports

## 1. Introduction to Agent Bricks
Agent Bricks is a Databricks-native framework for building and deploying generative AI agents.  
- It extracts structured insights from unstructured text.  
- It supports use cases such as **information extraction, classification, summarization, and knowledge assistance**.  
- Agents can be deployed and queried interactively or in batch, so they fit into existing analytics pipelines.

---

## 2. Why Use Agent Bricks for Our Dataset?
Our dataset contains medical device adverse event reports with long, unstructured text in `adverse_event_description`.  
- Challenges: Free-text descriptions are lengthy and complex.  
- Goal: Extract actionable insights for **root cause analysis (RCA)**.  

**Agent Bricks transforms these text fields into structured, queryable columns**, so the reports can be filtered, grouped, and summarized like any other table.

---

## 3. Output of the Agent
After running the Agent Brick on the dataset, the **structured output table** contains the following columns:

| Column | Description |
|--------|-------------|
| `device_name` | Name of the medical device involved in the adverse event or recall |
| `manufacturer` | Manufacturer of the device |
| `reason_for_recall` | Concise explanation of why the device was recalled or failed |
| `adverse_events` | Description of adverse events reported, including patient impact |
| `affected_devices` | Devices or models affected by the issue or recall |
| `actions_to_be_taken` | Recommended actions, such as device returns, updates, or patient notifications |

- Array fields such as `actions_to_be_taken` are **flattened into readable strings** by joining their items with `; `.  
- Rendered with `display()`, the table is **interactive** in Databricks notebooks.
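
The flattening step can be illustrated in plain Python. This is a minimal sketch rather than the agent's own code: the helper name `flatten_record` and the sample payload are invented for illustration, and the `; ` separator mirrors the `concat_ws` call used in the notebook code further down.

```python
import json

# Columns kept in the structured output table (taken from the table above).
OUTPUT_COLUMNS = [
    "device_name",
    "manufacturer",
    "reason_for_recall",
    "adverse_events",
    "affected_devices",
    "actions_to_be_taken",
]


def flatten_record(raw_json: str, sep: str = "; ") -> dict:
    """Parse one agent response and join any list values into one string."""
    parsed = json.loads(raw_json)
    row = {}
    for column in OUTPUT_COLUMNS:
        value = parsed.get(column)  # missing keys become None
        if isinstance(value, list):
            value = sep.join(str(item) for item in value)
        row[column] = value
    return row


# Hypothetical response, invented for illustration only.
sample = json.dumps({
    "device_name": "Example Sterilizer",
    "manufacturer": "Example Corp",
    "reason_for_recall": "Door seal failure",
    "adverse_events": "No patient injury reported",
    "affected_devices": "Model X100",
    "actions_to_be_taken": ["Stop using the device", "Contact the manufacturer"],
})

row = flatten_record(sample)
print(row["actions_to_be_taken"])
```

Keys that the agent omits come back as `None`, so a missing field shows up as an empty cell instead of raising an error.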

---

## 4. High-Level Steps
1. **Prepare the data**: Confirm that the table `mma_fe_innovation.mma.device_adverse_reports_silver` exists and that `adverse_event_description` is populated.  
2. **Create the agent**: Use the Information Extraction template in Agent Bricks.  
3. **Improve quality**: Provide examples and set global instructions (return valid JSON, use concise phrases, do not hallucinate details).  
4. **Deploy the agent**: Generate an endpoint that can be queried from SQL, Python, or the UI.  
5. **Run batch inference**: Use `ai_query` to process `adverse_event_description` into the structured columns above.  
6. **Analyze results**: Use the clean table to explore root causes, affected devices, adverse events, and recommended actions.
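
The quality instructions in step 3 (valid JSON, concise phrases) can also be checked after inference. The sketch below shows one possible check in plain Python. The function name `check_extraction`, the required-key set, and the 200-character limit are assumptions made for illustration, not Agent Bricks settings.

```python
import json

# Assumed set of keys every response should contain (matches the output table).
REQUIRED_KEYS = {
    "device_name",
    "manufacturer",
    "reason_for_recall",
    "adverse_events",
    "affected_devices",
    "actions_to_be_taken",
}

# Assumed upper bound for a "concise" value; tune this to your own data.
MAX_CHARS = 200


def check_extraction(raw: str) -> list:
    """Return a list of human-readable problems found in one agent response."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(parsed, dict):
        return ["top-level JSON value is not an object"]

    problems = []
    for key in sorted(REQUIRED_KEYS - set(parsed)):
        problems.append(f"missing key: {key}")
    for key, value in parsed.items():
        # Lists are checked item by item; scalars are checked directly.
        items = value if isinstance(value, list) else [value]
        if any(isinstance(item, str) and len(item) > MAX_CHARS for item in items):
            problems.append(f"value too long: {key}")
    return problems
```

An empty list means the response passed every check. Rows with problems are candidates for additional examples or tighter instructions in step 3.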

---

## 5. Interactive Queries
You can also ask the agent **natural language questions** like:  
- “Summarize root causes and actions for all Tuttnauer Autoclave reports.”  
- “List all devices with patient infection adverse events.”  
- “Show affected devices and clinical impacts for high severity recalls.”  

The agent responds in **structured JSON or text**, which can be displayed as a clean table for analysis.
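
Because a reply may be either JSON or free text, downstream code needs to handle both. The following is a minimal sketch of one way to normalize replies into table rows. The helper name `reply_to_rows` and the `text` fallback column are assumptions for illustration, not part of Agent Bricks.

```python
import json


def reply_to_rows(reply: str) -> list:
    """Turn an agent reply into a list of row dicts.

    A JSON object becomes one row, a JSON array of objects becomes many
    rows, and anything else is kept as a single free-text row.
    """
    try:
        parsed = json.loads(reply)
    except json.JSONDecodeError:
        return [{"text": reply}]
    if isinstance(parsed, dict):
        return [parsed]
    if isinstance(parsed, list) and all(isinstance(item, dict) for item in parsed):
        return parsed
    # Valid JSON that is not table-shaped (a bare number, a list of strings).
    return [{"text": reply}]


print(reply_to_rows('{"device_name": "Example Sterilizer"}'))
print(reply_to_rows("No matching reports were found."))
```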

---


## 6. Testing the Agent Endpoint with `ai_query`

```sql
%sql
SELECT
  adverse_event_description,
  ai_query(
    'kie-aaf115b7-endpoint',
    CONCAT(
      'Extract root cause, component, and symptom from the following text: ',
      adverse_event_description
    )
  ) AS extracted_rca
FROM mma_fe_innovation.mma.device_adverse_reports_silver
LIMIT 10;
```


```python
from pyspark.sql.functions import from_json, col, concat_ws
from pyspark.sql.types import StructType, StructField, StringType, ArrayType

# Define schema for Agent Brick output.
# Fields that are not part of the final table (e.g. model_numbers) are
# parsed here but dropped in Step 3.
rca_schema = StructType([
    StructField("device_name", StringType(), True),
    StructField("manufacturer", StringType(), True),
    StructField("model_numbers", ArrayType(StringType()), True),
    StructField("distribution_dates", StringType(), True),
    StructField("reason_for_recall", StringType(), True),
    StructField("adverse_events", StringType(), True),
    StructField("affected_devices", StringType(), True),
    StructField("actions_to_be_taken", ArrayType(StringType()), True),
    StructField("contact_information", StringType(), True),
    StructField("fda_reference_number", StringType(), True)
])

# Step 1: Run SQL query calling the Agent Brick endpoint
query = """
SELECT
  device_name,
  adverse_event_description,
  ai_query(
    'kie-aaf115b7-endpoint',
    CONCAT(
      'Extract structured recall and adverse event information from the following text: ',
      adverse_event_description
    )
  ) AS extracted_rca
FROM mma_fe_innovation.mma.device_adverse_reports_silver
LIMIT 10
"""

df_sql = spark.sql(query)

# Step 2: Cast the ai_query result to STRING and parse it as JSON.
# from_json returns NULL for rows whose text cannot be parsed.
df_parsed = df_sql.withColumn(
    "extracted_rca_str",
    col("extracted_rca").cast("string")
).withColumn(
    "rca_struct",
    from_json(col("extracted_rca_str"), rca_schema)
)

# Step 3: Keep only the demo columns from the parsed struct and
# join the actions array into a single "; "-separated string
df_clean = df_parsed.select(
    col("rca_struct.device_name").alias("device_name"),
    col("rca_struct.manufacturer").alias("manufacturer"),
    col("rca_struct.reason_for_recall").alias("reason_for_recall"),
    col("rca_struct.adverse_events").alias("adverse_events"),
    col("rca_struct.affected_devices").alias("affected_devices"),
    concat_ws("; ", col("rca_struct.actions_to_be_taken")).alias("actions_to_be_taken")
)

# Step 4: Print a text preview, truncating each cell to 20 characters
df_clean.show(truncate=20)
```


```python
# Render the interactive table (available in Databricks notebooks)
display(df_clean)
```
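
As a starting point for the root cause analysis itself, the extracted reasons can be tallied. The sketch below does this in plain Python on a few hypothetical rows invented for illustration. The helper name `top_recall_reasons` is an assumption, and the lowercase/strip normalization is only a rough way to merge near-identical phrasings.

```python
from collections import Counter


def top_recall_reasons(rows, n=3):
    """Count rows per normalized reason_for_recall, skipping missing values."""
    counts = Counter(
        row["reason_for_recall"].strip().lower()
        for row in rows
        if row.get("reason_for_recall")
    )
    return counts.most_common(n)


# Hypothetical rows, invented for illustration only.
rows = [
    {"device_name": "Device A", "reason_for_recall": "Door seal failure"},
    {"device_name": "Device B", "reason_for_recall": "door seal failure "},
    {"device_name": "Device C", "reason_for_recall": "Software fault"},
    {"device_name": "Device D", "reason_for_recall": None},
]

print(top_recall_reasons(rows))
```

On the Spark DataFrame, a `groupBy("reason_for_recall").count()` gives a similar tally. Because the reasons are free text, exact-match grouping will undercount causes that are worded differently.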