#Step 0: Set up

You can run this notebook on serverless compute or on any classic compute setup.

#Introduction

This is the final notebook from the DSA blogpost here: 

We will use DSPy to bring together everything you created in the previous three notebooks. DSPy's declarative framework and dspy.Module let us modularize our code tightly and reuse the pieces interchangeably, not just in this notebook but anywhere else. 

For example, you will create a module that makes a function call to a Genie Space. You can take that same module and use it elsewhere with no code changes. You will see this in action below. 

After this notebook, you will be able to: 
1. Use DSPy to call any LLM regardless of the provider 
2. Create dspy.Signatures to programmatically develop instructions for your LLM
3. Create dspy.Modules to put together your entire application 
4. Have a functioning assistant that can use any kind of data about a Patient!

#Install the Dependencies

In [0]:
%pip install --upgrade databricks-sdk mlflow databricks-vectorsearch dspy pillow litellm openai
dbutils.library.restartPython()



In [0]:
import mlflow
import mlflow.deployments
import base64
import io
from PIL import Image
import dspy
from databricks.vector_search.client import VectorSearchClient

client = mlflow.deployments.get_deploy_client("databricks")
mlflow.dspy.autolog() 

#Configure an LLM provider for DSPy

Use dspy.LM to specify which LLM you would like to use. To switch to a different LLM or provider, pass a different dspy.LM to dspy.configure(). No other code needs to change.
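The cell that follows leaves `api_key` empty. One option is to read the key from the environment and fall back to the Databricks-hosted model when it is missing. This is a minimal sketch, not the notebook's own method: `choose_model` is a hypothetical helper, and the `ANTHROPIC_API_KEY` variable name is an assumption about how the key is stored in your environment.

```python
import os
from typing import Mapping, Optional, Tuple

# Model identifiers copied from the notebook cell.
DATABRICKS_MODEL = "databricks/databricks-claude-3-7-sonnet"
ANTHROPIC_MODEL = "anthropic/claude-sonnet-4-20250514"


def choose_model(env: Mapping[str, str] = os.environ) -> Tuple[str, Optional[str]]:
    """Return (model identifier, api key) based on what is available in env."""
    # Assumption: the Anthropic key, if any, is stored under ANTHROPIC_API_KEY.
    api_key = env.get("ANTHROPIC_API_KEY", "").strip()
    if api_key:
        return ANTHROPIC_MODEL, api_key
    # No key found: fall back to the Databricks-hosted model, which needs no key here.
    return DATABRICKS_MODEL, None


lm_name, lm_api_key = choose_model()
print(lm_name)
# In the notebook, lm_name (and lm_api_key, when present) would be passed to
# dspy.LM, and the resulting object to dspy.configure(lm=...).
```

Keeping the choice in one function means the rest of the notebook never needs to know which provider was selected.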

In [0]:
import dspy

claude = dspy.LM('databricks/databricks-claude-3-7-sonnet', cache=False)
claude_anthropic = dspy.LM('anthropic/claude-sonnet-4-20250514', api_key="", cache=False)  # supply your own Anthropic API key here
# llama = dspy.LM('databricks/databricks-llama-4-maverick', cache=False)
dspy.configure(lm=claude_anthropic)

In [0]:
from config import volume_label, volume_name, catalog, schema, model_name, model_endpoint_name, embedding_table_name, embedding_table_name_index, registered_model_name, vector_search_endpoint_name

#Testing our tools

Let's instantiate our tools and test that they still work

##Vector Search Index

In [0]:
vs_client = VectorSearchClient()

vector_search_endpoint_name = "one-env-shared-endpoint-4"  # overrides the value imported from config; replace with your own endpoint
index_name = f"{embedding_table_name}_index"

[NOTICE] Using a notebook authentication token. Recommended for development only. For improved performance, please use Service Principal based authentication. To disable this message, pass disable_notice=True.


In [0]:
index = vs_client.get_index(endpoint_name=vector_search_endpoint_name, index_name=f"{catalog}.{schema}.{index_name}")

##Vector Search Index + Model Serving Endpoint

In [0]:
input_query = "what is my deductible for united healthcare"

response = client.predict(
            endpoint=model_endpoint_name,
            inputs={"dataframe_split": {
                    "columns": ["text"],
                    "data": [[input_query]]
                    }
            }
          )

query_embedding = response['predictions']['predictions']['embedding']
results = index.similarity_search(num_results=5, columns=["base64_image"], query_vector=query_embedding)
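
The request payload and the response indexing used above are repeated later inside the module, so one option is to factor them into two small helpers. This is only a sketch: `build_embedding_request` and `extract_embedding` are hypothetical names, and the response shape is assumed to match what the cell above indexes into.

```python
from typing import Any, Dict, List


def build_embedding_request(text: str) -> Dict[str, Any]:
    """Build the dataframe_split payload in the same shape as the cell above."""
    return {"dataframe_split": {"columns": ["text"], "data": [[text]]}}


def extract_embedding(response: Dict[str, Any]) -> List[float]:
    """Pull the embedding out of the response, failing with a readable message."""
    try:
        return response["predictions"]["predictions"]["embedding"]
    except (KeyError, TypeError) as err:
        raise ValueError(f"Unexpected embedding response shape: {err!r}") from err


# Placeholder response with the assumed shape (not real endpoint output).
fake_response = {"predictions": {"predictions": {"embedding": [0.1, 0.2, 0.3]}}}
print(build_embedding_request("what is my deductible"))
print(extract_embedding(fake_response))
```

In the notebook, `build_embedding_request(input_query)` would take the place of the inline dictionary passed as `inputs`, and `extract_embedding(response)` would replace the chained indexing.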



In [0]:
base64_test_retrieved = results['result']['data_array'][0][0]
print(base64_test_retrieved) #the matching image vector search retrieved

/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAoHBwgHBgoICAgLCgoLDhgQDg0NDh0VFhEYIx8lJCIfIiEmKzcvJik0KSEiMEExNDk7Pj4+JS5ESUM8SDc9Pjv/... (output truncated)
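
Before handing a retrieved string to PIL, it can be worth confirming that it really decodes to an image. The sketch below checks only the first few decoded bytes against two well-known file signatures (JPEG and PNG). `sniff_image_format` is a hypothetical helper, and the sample string is the beginning of the output printed above.

```python
import base64
from typing import Optional

# File signatures ("magic bytes") for the two formats checked here.
MAGIC_BYTES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
}


def sniff_image_format(base64_string: str) -> Optional[str]:
    """Guess the image format from the first decoded bytes, or return None."""
    # 16 base64 characters decode to 12 bytes, enough for the signatures above.
    prefix = base64_string[:16]
    # Trim to a multiple of 4 characters so the prefix decodes without padding errors.
    prefix = prefix[: len(prefix) - len(prefix) % 4]
    head = base64.b64decode(prefix)
    for magic, name in MAGIC_BYTES.items():
        if head.startswith(magic):
            return name
    return None


print(sniff_image_format("/9j/4AAQSkZJRgABAQAAAQABAAD/"))  # prints: jpeg
```

A `None` result would suggest that the index returned something other than the expected image column.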

#Check your Genie Space

Change genie_space_id to the ID of the space you created in the third notebook, then run the cell. It should return some information about a patient. If it does not work at first, look in your data for a patient name that was actually generated, because the name used here may not exist in your tables.

In [0]:
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.dashboards import GenieAPI

def hls_patient_genie(sql_instruction):

  w = WorkspaceClient()
  genie_space_id = "01effef4c7e113f9b8952cf568b49ac7" #replace this with your genie space ID that you created

  conversation = w.genie.start_conversation_and_wait(
      space_id=genie_space_id,
      content=f"{sql_instruction} always limit to one result"
  )

  response = w.genie.get_message_attachment_query_result(
    space_id=genie_space_id,
    conversation_id=conversation.conversation_id,
    message_id=conversation.message_id,
    attachment_id=conversation.attachments[0].attachment_id
  )

  return response.statement_response.result.data_array

hls_result = hls_patient_genie(sql_instruction="Find information about Richard Massey")
print(hls_result)

[['Richard', 'Massey', 'UnitedHealth', 'HMO', None, 'alawrence@example.org', 'LA', '8216', 'Those government continue charge recognize decide seem again.', 'Physical Therapy']]
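
The function above always reads `attachments[0]`, which assumes the first attachment is the query result. A more defensive variant could look for the first attachment that actually carries a query. The sketch below uses stand-in objects rather than the SDK, and it assumes each attachment exposes `query` and `attachment_id` attributes. `first_query_attachment_id` is a hypothetical helper.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Optional


def first_query_attachment_id(attachments: Optional[Iterable[Any]]) -> Optional[str]:
    """Return the id of the first attachment that has a query, or None."""
    for attachment in attachments or []:
        # Assumption: text-only attachments have no query set.
        if getattr(attachment, "query", None) is not None:
            return getattr(attachment, "attachment_id", None)
    return None


# Stand-in objects for illustration only (not real Genie responses).
text_only = SimpleNamespace(attachment_id="a1", query=None)
with_query = SimpleNamespace(attachment_id="a2", query=SimpleNamespace(query="SELECT 1"))
print(first_query_attachment_id([text_only, with_query]))  # prints: a2
```

When the helper returns `None`, the caller can raise a clear error or retry with a rephrased instruction instead of failing on an index lookup.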


#DSPy development

Now that we have confirmed the resources work, let's define the dspy.Signatures. 

Signatures replace prompt engineering. Your goal for a signature is to provide the parameters and define an input-output relationship. Each signature should make clear what it is meant to achieve. 

Each signature can also be used interchangeably with different modules and with each other. 

Below we will define 2 signatures and 1 module that combines them: 
1. patient_information_extraction: Takes the text query, calls the Genie Space to find the patient's details (such as insurance type), and extracts keywords so that we send more relevant words to the vector search index
2. image_analyzer: Takes the image returned by the vector search index, the Genie results and the original text query, and sends these to the LLM to interpret and create a response
3. MultiModalPatientInsuranceAnalyzer: The dspy.Module that wires the two signatures and the tools together

**Goal**: Identify the deductible of the patient's insurance

**Problem**: The tables that the Genie Space queries do not have a column called deductible. That value is located in the PDFs containing insurance information. Based on the input text query, we need to figure out: 
1. Who the Patient is
2. What we are looking for
3. Where to get it 

**Solution**: Let the agent take the text_query and identify the patient and the keywords that describe what we are looking for. The patient's name goes to the Genie tool and the keywords go to Vector Search to find the necessary information. We then consolidate the results and let the LLM read through them to reach a conclusion. 
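
The flow described in the solution can be sketched without any LLM or Databricks service by passing each step in as a plain function. Everything here is illustrative: `Extraction`, `answer_query` and the `fake_*` functions are hypothetical stand-ins with placeholder values, and only the field names mirror the ones used in this notebook.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Extraction:
    """Stand-in for the fields the extraction step is expected to return."""
    genie_patient_response: List[Any]
    keywords_for_vector_search: str


def answer_query(
    text_query: str,
    extract: Callable[[str], Extraction],
    search_documents: Callable[[str], Any],
    analyze: Callable[[Any, List[Any], str], Any],
) -> Any:
    # 1. Find the patient and the keywords describing what we are looking for.
    extraction = extract(text_query)
    # 2. Retrieve the matching insurance document using those keywords.
    document = search_documents(extraction.keywords_for_vector_search)
    # 3. Let the final step read everything and answer the original question.
    return analyze(document, extraction.genie_patient_response, text_query)


# Placeholder callables so the flow can be exercised without any services.
calls: List[str] = []


def fake_extract(query: str) -> Extraction:
    calls.append("extract")
    return Extraction([["PLACEHOLDER_ROW"]], "placeholder keywords")


def fake_search(keywords: str) -> str:
    calls.append(f"search:{keywords}")
    return "PLACEHOLDER_DOCUMENT"


def fake_analyze(document: Any, genie_rows: List[Any], query: str) -> dict:
    calls.append("analyze")
    return {"document": document, "rows": genie_rows, "query": query}


result = answer_query("placeholder question", fake_extract, fake_search, fake_analyze)
print(calls)
```

Injecting the steps this way makes the orchestration order easy to check in isolation before the real tools are plugged in.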


In [0]:
class image_analyzer(dspy.Signature):
  """review the image and genie_patient_response to answer the text_query"""
  image: dspy.Image = dspy.InputField() 
  genie_patient_response: list = dspy.InputField()
  text_query: str = dspy.InputField()
  response: str = dspy.OutputField() 
  deductible: str = dspy.OutputField()

class patient_information_extraction(dspy.Signature):
  """This class only extracts and returns information from relevant tools based on the text_query. Include relevant information from the genie_patient_response in the keywords_for_vector_search"""
  text_query: str = dspy.InputField()
  genie_patient_response: list = dspy.OutputField()
  keywords_for_vector_search: str = dspy.OutputField(desc="string of keywords to pass to vector search")

class MultiModalPatientInsuranceAnalyzer(dspy.Module):
  def __init__(self):
    super().__init__()
    self.image_analyzer = dspy.Predict(image_analyzer)
    self.patient_information_extraction = dspy.ReAct(patient_information_extraction, tools=[self.hls_patient_genie], max_iters=1)
  
  def process_image(self, base64_string):
    image_data = base64.b64decode(base64_string) 
    pil_image = Image.open(io.BytesIO(image_data))
    dspy_image = dspy.Image.from_PIL(pil_image)
    return dspy_image
  
  def vector_search_for_patient_pdf(self, text_query):
    """Pulls matching Insurance Documents based on the text_query"""
    client = mlflow.deployments.get_deploy_client("databricks") 
    response = client.predict(
              endpoint=model_endpoint_name,
              inputs={"dataframe_split": {
                      "columns": ["text"],
                      "data": [[text_query]]
                      }
              }
            )
    text_embedding = response['predictions']['predictions']['embedding']
    index = vs_client.get_index(endpoint_name=vector_search_endpoint_name, index_name=f"{catalog}.{schema}.{index_name}")
    results = index.similarity_search(num_results=3, columns=["base64_image"], query_vector=text_embedding)
    return results['result']['data_array'][0][0]
  
  def hls_patient_genie(self, patient_name):
    """Pull Patient information based on the patient's name"""
    w = WorkspaceClient()
    genie_space_id = "01effef4c7e113f9b8952cf568b49ac7" #replace this with your genie space ID that you created

    conversation = w.genie.start_conversation_and_wait(
        space_id=genie_space_id,
        content=f"Find any details about {patient_name}. Limit your answer to one result."
    )

    response = w.genie.get_message_attachment_query_result(
      space_id=genie_space_id,
      conversation_id=conversation.conversation_id,
      message_id=conversation.message_id,
      attachment_id=conversation.attachments[0].attachment_id
    )

    return response.statement_response.result.data_array


  def forward(self, text_query: str):
    results = self.patient_information_extraction(text_query=text_query)
    base64_str = self.vector_search_for_patient_pdf(text_query=results.keywords_for_vector_search)
    dspy_image = self.process_image(base64_string=base64_str)
    return self.image_analyzer(image=dspy_image, genie_patient_response=results.genie_patient_response, text_query=text_query)

#Let's run it!

This agent follows this process: 
1. Identify the patient in the text_query
2. Identify keywords to send to the vector search 
3. Use the results of the Genie Space to better retrieve relevant insurance document information, as the insurance could be an HMO, PPO or EPO plan 
4. Take the base64 image from the vector search index and the results of the Genie Space to properly answer the question
5. Provide a response and a deductible, as we programmed, so that we can access the deductible value

In [0]:
analyzer = MultiModalPatientInsuranceAnalyzer()
analyzer_output = analyzer(text_query="what is Ashley Hall's overall deductible?")


Trace(request_id=tr-a41433e3009b45c983de7fb693c78d63)

In [0]:
print(f"The LLM Response: {analyzer_output.response}\n\nThe Deductible: {analyzer_output.deductible}")

The LLM Response: Based on the Summary of Benefits and Coverage document, Ashley Hall's overall deductible is $500 for an individual or $1,000 for a family. Since this appears to be a family plan (Plan Type: PPO), the family deductible would be $1,000, but each individual family member has their own $500 deductible that contributes to the overall family deductible amount.

The Deductible: $500 individual / $1,000 family


Downstream, we could use this output to update the Delta table that the Genie Space uses, allowing us to combine unstructured and structured data to enrich our data sources!
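
If the deductible were written back to a table, the free-text value would probably need to be turned into numbers first. Here is a minimal sketch, assuming the deductible string looks like the one printed above (dollar amounts, each optionally followed by a one-word label). `parse_deductible` is a hypothetical helper, not part of DSPy or of this notebook.

```python
import re
from decimal import Decimal
from typing import Dict

# A dollar amount such as "$1,000" or "$1,250.50", optionally followed by a label word.
AMOUNT_PATTERN = re.compile(r"\$\s*(\d[\d,]*(?:\.\d+)?)(?:\s+([A-Za-z]+))?")


def parse_deductible(text: str) -> Dict[str, Decimal]:
    """Map each label (or "overall" when no label follows) to its dollar amount."""
    amounts: Dict[str, Decimal] = {}
    for match in AMOUNT_PATTERN.finditer(text):
        amount_text, label = match.groups()
        key = (label or "overall").lower()
        # Drop thousands separators before converting to a number.
        amounts[key] = Decimal(amount_text.replace(",", ""))
    return amounts


print(parse_deductible("$500 individual / $1,000 family"))
```

An empty dictionary signals that no dollar amount was found, which is a reasonable cue to skip the update rather than write a guess.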