# Agent

Use this notebook to iterate on the code and configuration of your Agent.

By the end of this notebook, you will have one or more registered versions of your Agent, each coupled with a detailed quality evaluation. To interact with this Agent through the Playground or share it with your business stakeholders for feedback, use the notebooks that follow this one.

For each version, you will have an MLflow run inside your MLflow experiment that contains:
- Your Agent's code & config
- Evaluation metrics for cost, quality, and latency

## 👉 START HERE: How to use this notebook

We suggest the following approach to using this notebook to build and iterate on your Agent's quality.
1. Build an initial version of your Agent by tweaking the smart default settings and code in this notebook.

2. Vibe check & iterate on the Agent's quality to reach a "not embarrassingly bad" level of quality. Test 5 - 10 questions using MLflow Tracing & Agent Evaluation's quality root cause analysis to guide your iteration.

3. Use the later notebooks to share your Agent with stakeholders to collect feedback that you will turn into an evaluation set with questions/correct responses labeled by your stakeholders.

4. Use this notebook to run Agent Evaluation on this evaluation set.

5. Same as step 2, use MLflow Tracing & Agent Evaluation's quality root cause analysis to guide your iteration.  Iteratively try and evaluate various strategies to improve the quality of your agent and/or retriever.  For a deep dive on these strategies, view AI cookbook's [retrieval](https://ai-cookbook.io/nbs/5-hands-on-improve-quality-step-1-retrieval.html) and [generation](https://ai-cookbook.io/nbs/5-hands-on-improve-quality-step-1-generation.html) guides.

6. Repeat step 3 to collect more feedback, then repeat steps 4 and 5 to further improve quality.


**Important note:** Throughout this notebook, we indicate which cells' code you:
- ✅✏️ should customize - these cells contain code & config with business logic that you should edit to meet your requirements & tune quality.
- 🚫✏️ should not customize - these cells contain boilerplate code required to load/save/execute your Agent

*Cells that don't require customization still need to be run!  You CAN change these cells, but if this is the first time using this notebook, we suggest not doing so.*

### 🚫✏️ Install Python libraries

You do not need to modify this cell unless you need additional Python packages in your Agent.

In [0]:
%pip install -qqqq -U -r requirements.txt
# Restart to load the packages into the Python environment
dbutils.library.restartPython()

In [1]:
# Shared imports
from datetime import datetime
# from IPython.display import display_markdown

### 🚫✏️ Connect to Databricks

If running locally in an IDE using Databricks Connect, connect the Spark client & configure MLflow to use Databricks Managed MLflow. If this is running in a Databricks Notebook, these values are already set.

In [2]:
from mlflow.utils import databricks_utils as du

if not du.is_in_databricks_notebook():
    from databricks.connect import DatabricksSession
    import os

    spark = DatabricksSession.builder.getOrCreate()
    os.environ["MLFLOW_TRACKING_URI"] = "databricks"

### 🚫✏️ Load the Agent's storage locations

This notebook uses the UC model, MLflow Experiment, and Evaluation Set that you specified in the [Agent setup](02_agent_setup.ipynb) notebook.

In [1]:
from cookbook.config.common.agent_storage_locations import AgentStorageConfig
import mlflow 

# Load the Agent's storage configuration
agent_storage_config = AgentStorageConfig.from_yaml_file('./configs/agent_storage_config.yaml')

mlflow.set_experiment(agent_storage_config.mlflow_experiment_name)

<Experiment: artifact_location='file:///Users/eric.peter/Github/genai-cookbook/agent_app_sample_code/mlruns/875381203371509471', creation_time=1730832700880, experiment_id='875381203371509471', last_update_time=1730832700880, lifecycle_stage='active', name='/Users/eric.peter@databricks.com/my_agent_mlflow_experiment', tags={}>

In [2]:
%load_ext autoreload
%autoreload 2


## 0️⃣ Setup: Load the Agent's configuration that is shared with the other notebooks


## 1️⃣ Iterate on the Agent's code & config to improve quality

The below cells are used to execute your inner dev loop to improve the Agent's quality.

If you are creating this Agent for the first time, you will:
1. Review the smart defaults provided in the Agent's code & configuration
2. Vibe check the Agent for 1 query to verify it works

We suggest the following inner dev loop:
1. Run the Agent for 1+ queries or your evaluation set
2. Determine if the Agent's output is correct for those queries, i.e., high quality
3. Based on that assessment, make changes to the code/config to improve quality
4. 🔁 Re-run the Agent for the same queries, repeating this cycle.
5. Once you have a version of the Agent with sufficient quality, log the Agent to MLflow
6. Use the next notebooks to share the Agent with your stakeholders & collect feedback
7. Add your stakeholders' queries & feedback to your evaluation set
8. 🔁 Use that evaluation set to repeat this cycle
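
As a rough illustration of steps 1 and 2 of this loop, the sketch below runs a handful of queries through an agent callable and collects the outputs for manual review. `vibe_check` and `echo_agent` are hypothetical names introduced here; `echo_agent` is only a stand-in for your real Agent, and the request shape mirrors the `messages` input example used elsewhere in this notebook.

```python
from typing import Any, Callable, Dict, List


def vibe_check(
    agent_fn: Callable[[Dict[str, Any]], Any], queries: List[str]
) -> List[Dict[str, Any]]:
    """Run each query through the agent and collect outputs (or errors) for review."""
    results = []
    for query in queries:
        request = {"messages": [{"role": "user", "content": query}]}
        try:
            output = agent_fn(request)
            results.append({"query": query, "output": output, "error": None})
        except Exception as exc:  # keep going so one bad query doesn't hide the rest
            results.append({"query": query, "output": None, "error": repr(exc)})
    return results


def echo_agent(request: Dict[str, Any]) -> Dict[str, str]:
    """Stand-in for a real Agent (hypothetical); it just echoes the last user message."""
    return {"content": f"(stub) You asked: {request['messages'][-1]['content']}"}


results = vibe_check(echo_agent, ["What is RAG?"])
for row in results:
    print(row["query"], "->", row["output"] or row["error"])
```

Swapping `echo_agent` for the real Agent's predict function would keep the same review loop while you change code and config between runs.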


### ✅✏️ Change the Agent's code & config

#### ✅✏️ ⚙️ Adjust the Agent's configuration

Here, we parameterize your Agent's code with common settings you will tune to improve quality, such as prompts.

> *Note: Our template Agents use [Pydantic](https://docs.pydantic.dev/latest/) models, which are thin wrappers around Python dictionaries. Pydantic allows us to define the initial parameters Databricks suggests for tuning quality and allows this notebook to validate the parameter changes you make. It also provides type-checking for config objects and IDE-friendly property name references.*

We use Pydantic to define tools and support automatically serializing their classnames and configs to YAML that can be
loaded back. To implement new tools, implement a new subclass of `utils.agents.tools.BaseTool` and pass it to `AgentConfig` below.

You can (and often will need to) add or adjust the parameters in our template. To add, modify, or delete a parameter, edit the Pydantic classes in the modules under `utils.agents`.
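
To make the validation behavior concrete, here is a minimal sketch of a Pydantic (v2) model with bounded parameters. `ExampleLLMParameters` is a hypothetical class written for this illustration; it is not one of the cookbook's config classes, and its bounds are assumptions.

```python
from pydantic import BaseModel, Field, ValidationError


class ExampleLLMParameters(BaseModel):
    """Hypothetical config model; not one of the cookbook's classes."""

    temperature: float = Field(default=0.01, ge=0.0, le=2.0)
    max_tokens: int = Field(default=1500, gt=0)


# Valid overrides are type-checked and exposed as attributes
params = ExampleLLMParameters(temperature=0.2)
as_dict = params.model_dump()
print(as_dict)

# An out-of-range value is rejected when the object is constructed
try:
    ExampleLLMParameters(max_tokens=-5)
    rejected = False
except ValidationError as err:
    rejected = True
    print(f"Rejected with {len(err.errors())} validation error(s)")
```

Adding a parameter to a real config class would follow the same pattern: declare a typed field with a default, then set it from this notebook.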

In [8]:
# Import Pydantic models
from cookbook.config.agents.function_calling_agent import (
    FunctionCallingAgentConfig,
)
from cookbook.config.common.llm import LLMConfig, LLMParametersConfig
from cookbook.config.tools.vector_search_tool import (
    VectorSearchRetriever,
    VectorSearchRetrieverTool,
    VectorSearchParameters,
    VectorSearchSchema,
)
from agent_app_sample_code._scratch_pad.old_configs import get_agent_dependencies, log_pyfunc_agent
import json
import yaml

# # View the Retriever tool's documentation by inspecting the docstrings
#
# print(VectorSearchRetrieverTool.__doc__)
# print(VectorSearchSchema.__doc__)
#
# # View documentation for the parameters by inspecting the docstrings
#
# print(LLMConfig.__doc__)
# print(LLMParametersConfig.__doc__)
# print(FunctionCallingAgentConfig.__doc__)

Dev loop for a new tool:

* Create the tool function in a Python file
* Develop a set of test cases (queries, expected parameters, tool outputs, expected LLM responses)
* Unit test the tool locally
* Deploy it to Unity Catalog (UC)
* Re-run the tests against the UC tool
* Test it in the Agent on its own to make sure the parameters are set correctly
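
The first three steps can be sketched locally before anything is deployed. The example below uses the `translate_sku` function that appears later in this notebook, whose description says it converts "OLD-XXX-YYYY" to "NEW-YYYY-XXX". The function body and the test cases are assumptions written for illustration; the actual UC function's implementation is not shown in this notebook.

```python
def translate_sku(old_sku: str) -> str:
    """Translate "OLD-XXX-YYYY" to "NEW-YYYY-XXX" (assumed implementation)."""
    parts = old_sku.split("-")
    if len(parts) != 3 or parts[0] != "OLD":
        raise ValueError(f"Expected format OLD-XXX-YYYY, got: {old_sku!r}")
    _, xxx, yyyy = parts
    return f"NEW-{yyyy}-{xxx}"


# Hypothetical test cases: (tool input, expected tool output)
test_cases = [
    ("OLD-ABC-1234", "NEW-1234-ABC"),
    ("OLD-XYZ-0001", "NEW-0001-XYZ"),
]

for old_sku, expected in test_cases:
    actual = translate_sku(old_sku)
    assert actual == expected, f"{old_sku}: expected {expected}, got {actual}"

# Malformed input should fail loudly rather than return a bad SKU
try:
    translate_sku("123456")
    malformed_rejected = False
except ValueError:
    malformed_rejected = True

print("All local tool tests passed:", malformed_rejected)
```

The same `test_cases` list could then be replayed against the deployed UC function to confirm it behaves like the local version.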


In [4]:
%load_ext autoreload
%reload_ext autoreload
%autoreload 2


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [11]:
from databricks.sdk import WorkspaceClient
from databricks.sdk.errors import ResourceDoesNotExist
w = WorkspaceClient()
try:
    t = w.functions.get(name="ep.cookbook_local_test.translate_sku")
    print(t)
except ResourceDoesNotExist:
    print("Function does not exist")


FunctionInfo(browse_only=None, catalog_name='ep', comment='Translates a pre-2024 SKU formatted as "OLD-XXX-YYYY" to the new SKU format "NEW-YYYY-XXX".', created_at=1730834698067, created_by='eric.peter@databricks.com', data_type=<ColumnTypeName.STRING: 'STRING'>, external_language='Python', external_name=None, full_data_type='STRING', full_name='ep.cookbook_local_test.translate_sku', function_id='0681f6af-e649-470c-a13d-717ba2ad5b3c', input_params=FunctionParameterInfos(parameters=[FunctionParameterInfo(name='old_sku', type_text='string', type_name=<ColumnTypeName.STRING: 'STRING'>, position=0, comment='The old SKU in the format "OLD-XXX-YYYY".', parameter_default=None, parameter_mode=None, parameter_type=<FunctionParameterType.PARAM: 'PARAM'>, type_interval_type=None, type_json='{"name":"old_sku","type":"string","nullable":true,"metadata":{"comment":"The old SKU in the format \\"OLD-XXX-YYYY\\"."}}', type_precision=0, type_scale=0)]), is_deterministic=False, is_null_call=None, metasto

In [32]:
%autoreload 2

In [1]:
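import json

from cookbook.config.tools.uc_tool import UCTool

# Wrap the UC function as a tool the Agent can call
tool = UCTool(uc_function_name="ep.cookbook_local_test.translate_sku")

# tool(old_sku="123456")

test = tool.name

# Inspect the JSON schema that is presented to the LLM
json.dumps(tool.get_json_schema())

# print(test)

# Serialize the tool's config to YAML so it can be loaded back later
data = tool.to_yaml()

with open("./configs/tool_test.yaml", "w") as handle:
    handle.write(data)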

In [2]:
from cookbook.config.base import load_serializable_config_from_yaml

with open("./configs/tool_test.yaml", "r") as handle:
    data = handle.read()

model = load_serializable_config_from_yaml(data)

model

UCTool(name='ep__cookbook_local_test__translate_sku', description='Translates a pre-2024 SKU formatted as "OLD-XXX-YYYY" to the new SKU format "NEW-YYYY-XXX".', uc_function_name='ep.cookbook_local_test.translate_sku', error_prompt='Error in generated code.  Please think step-by-step about how to fix the error and try calling this tool again with corrected inputs that reflect this thinking.')

In [26]:

from unitycatalog.ai.openai.toolkit import UCFunctionToolkit
from unitycatalog.ai.core.databricks import DatabricksFunctionClient

uc_client = DatabricksFunctionClient()

f=uc_client.get_function(function_name="ep.cookbook_local_test.translate_sku")

f.comment
f.name

toolkit = UCFunctionToolkit(
            function_names=["ep.cookbook_local_test.translate_sku"], client=uc_client
        )

toolkit.tools[0]



{'type': 'function',
 'function': {'name': 'ep__cookbook_local_test__translate_sku',
  'strict': True,
  'parameters': {'properties': {'old_sku': {'anyOf': [{'type': 'string'},
      {'type': 'null'}],
     'description': 'The old SKU in the format "OLD-XXX-YYYY".',
     'title': 'Old Sku'}},
   'title': 'ep__cookbook_local_test__translate_sku__params',
   'type': 'object',
   'additionalProperties': False,
   'required': ['old_sku']},
  'description': 'Translates a pre-2024 SKU formatted as "OLD-XXX-YYYY" to the new SKU format "NEW-YYYY-XXX".'}}
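
The tool spec above lists `old_sku` as required and sets `additionalProperties` to `False`. As a minimal sketch of what those two fields imply, the code below checks a JSON string of arguments against a trimmed copy of that spec. `check_arguments` is a hypothetical helper written for this example, not part of `unitycatalog.ai`, and it assumes the LLM returns tool arguments as a JSON string.

```python
import json
from typing import Any, Dict, List

# Trimmed copy of the tool spec printed above
tool_spec: Dict[str, Any] = {
    "type": "function",
    "function": {
        "name": "ep__cookbook_local_test__translate_sku",
        "parameters": {
            "properties": {
                "old_sku": {"anyOf": [{"type": "string"}, {"type": "null"}]}
            },
            "type": "object",
            "additionalProperties": False,
            "required": ["old_sku"],
        },
    },
}


def check_arguments(spec: Dict[str, Any], arguments_json: str) -> List[str]:
    """Return a list of problems with the arguments (hypothetical helper)."""
    params = spec["function"]["parameters"]
    arguments = json.loads(arguments_json)
    problems = []
    # Every name listed under `required` must be present
    for name in params.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    # With additionalProperties=False, unknown argument names are not allowed
    if params.get("additionalProperties") is False:
        for name in arguments:
            if name not in params["properties"]:
                problems.append(f"unexpected argument: {name}")
    return problems


ok = check_arguments(tool_spec, '{"old_sku": "OLD-ABC-1234"}')
bad = check_arguments(tool_spec, '{"sku": "OLD-ABC-1234"}')
print(ok)
print(bad)
```

This only covers presence and naming; it does not validate argument types.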

In [1]:
%load_ext autoreload
%autoreload 2


In [2]:
from cookbook.config.tools.vector_search_tool import VectorSearchRetrieverTool, VectorSearchSchema, VectorSearchParameters
# from utils.agents.tools_2 import load_obj_from_yaml
retriever_config = VectorSearchRetrieverTool(
    name="search_product_docs",
    description="Use this tool to search for product documentation.",
    vector_search_index="ep.cookbook_local_test.my_pdfs_docs_chunked_index__v1",
    vector_search_schema=VectorSearchSchema(
        chunk_text="content_chunked",
        document_uri="doc_uri",
        additional_metadata_columns=["parser_status"]
    ),
    # doc_similarity_threshold=0.0,
    # vector_search_parameters=VectorSearchParameters(
    #     num_results=5,
    #     query_type="ann"
    # ),
    # filterable_columns=["doc_uri", "chunk_id"]
)

retriever_config.filterable_columns_descriptions_for_llm

filters = [{"field": "doc_uri", "filter": "/Volumes/ep/cookbook_local_test/source_docs/Updating RAG Studio to Ingest PDFs.pdf"}]

# retriever_config(query="test", filters=filters)

# data = retriever_config.to_yaml()  

# with open("./configs/test.yaml", "w") as handle:
#     handle.write(data)

# retriever_config.filterable_columns_descriptions_for_llm
# retriever_config.vector_search_schema.all_columns
json.dumps(retriever_config.get_json_schema())

'{"type": "function", "function": {"name": "search_product_docs", "description": "Use this tool to search for product documentation.", "parameters": {"type": "object", "required": ["query"], "additionalProperties": false, "properties": {"query": {"description": "query to look up in retriever", "type": "string"}}}}}'

In [3]:
retriever_config.model_dump()

{'name': 'search_product_docs',
 'description': 'Use this tool to search for product documentation.',
 'vector_search_index': 'ep.cookbook_local_test.my_pdfs_docs_chunked_index__v1',
 'filterable_columns': [],
 'vector_search_schema': {'chunk_text': 'content_chunked',
  'document_uri': 'doc_uri',
  'additional_metadata_columns': ['parser_status']},
 'doc_similarity_threshold': 0.0,
 'vector_search_parameters': {'num_results': 5, 'query_type': 'ann'},
 'retriever_query_parameter_prompt': 'query to look up in retriever',
 'retriever_filter_parameter_prompt': 'optional filters to apply to the search. An array of objects, each specifying a field name and the filters to apply to that field.',
 'class_path': 'utils.agents.vector_search.VectorSearchRetrieverTool'}

In [161]:
from cookbook.config.base import load_serializable_config_from_yaml

with open("./configs/test.yaml", "r") as handle:
    data = handle.read()

model = load_serializable_config_from_yaml(data)

model.vector_search_schema.all_columns

['parser_status', 'content_chunked', 'chunk_id', 'doc_uri']

In [66]:
table = "ep.cookbook_local_test.my_pdfs_docs_chunked__v1_test"
table_info = w.tables.get(table)
# print(f"Table info: {table_info.columns}")
for column in table_info.columns:
    print(f"Column: {column.name}")
    print(f"Column: {column.type_text}")
    print(f"Column: {column.comment}")
    print(len(column.comment))

Column: chunk_id
Column: string
Column: Unique identifier for each chunk of data in the document.
57
Column: content_chunked
Column: string
Column: 
0
Column: parser_status
Column: string
Column: Represents the status of the document parsing process for the chunk.
68
Column: doc_uri
Column: string
Column: The unique identifier for the document, allowing easy reference and tracking.
77
Column: last_modified
Column: timestamp
Column: The timestamp indicating when the chunk was last modified, providing a way to track changes and updates.
104
Column: content_chunked_array
Column: array<string>
Column: None


TypeError: object of type 'NoneType' has no len()
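
The `TypeError` above occurs because `comment` is `None` for a column that has no comment, and `len(None)` is not defined. A minimal sketch of a None-safe version follows. `ExampleColumn` is a hypothetical stand-in for the column objects returned by the SDK, used here so the example runs without a workspace.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleColumn:
    """Hypothetical stand-in for a table column with an optional comment."""

    name: str
    type_text: str
    comment: Optional[str] = None


columns = [
    ExampleColumn(
        "chunk_id",
        "string",
        "Unique identifier for each chunk of data in the document.",
    ),
    ExampleColumn("content_chunked_array", "array<string>", None),
]

comment_lengths = {}
for column in columns:
    # Treat a missing comment as an empty string before measuring its length
    comment = column.comment or ""
    comment_lengths[column.name] = len(comment)

print(comment_lengths)
```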

In [40]:
from databricks.sdk.service.vectorsearch import VectorIndexType
index_info = w.vector_search_indexes.get_index("ep.cookbook_local_test.my_pdfs_docs_chunked_index__v1")
index_info = w.vector_search_indexes.get_index("ep.cookbook_local_test.test")

index_info.as_dict()

index_type = index_info.index_type
if index_type == VectorIndexType.DELTA_SYNC:
    source_table = index_info.delta_sync_index_spec.source_table
    print(f"Source table: {source_table}")
    primary_key = index_info.primary_key
    print(f"Primary key: {primary_key}")
    print(index_info)
    table_info = w.tables.get(source_table)
    # print(f"Table info: {table_info.columns}")
    for column in table_info.columns:
        print(f"Column: {column.name}")
        print(f"Column: {column.type_text}")
        print(f"Column: {column.comment}")
        # `comment` is None for columns without a comment, so fall back to an empty string
        print(len(column.comment or ""))

# index_type

Source table: ep.cookbook_local_test.my_pdfs_docs_chunked__v1
Primary key: chunk_id
VectorIndex(creator='eric.peter@databricks.com', delta_sync_index_spec=DeltaSyncVectorIndexSpecResponse(embedding_source_columns=[EmbeddingSourceColumn(embedding_model_endpoint_name='databricks-gte-large-en', name='content_chunked')], embedding_vector_columns=[], embedding_writeback_table=None, pipeline_id='de6852cf-94f7-4f17-a12d-660d3968398b', pipeline_type=<PipelineType.TRIGGERED: 'TRIGGERED'>, source_table='ep.cookbook_local_test.my_pdfs_docs_chunked__v1'), direct_access_index_spec=None, endpoint_name='ericpeter_vector_search', index_type=<VectorIndexType.DELTA_SYNC: 'DELTA_SYNC'>, name='ep.cookbook_local_test.test', primary_key='chunk_id', status=VectorIndexStatus(index_url='e2-dogfood.staging.cloud.databricks.com/api/2.0/vector-search/endpoints/ericpeter_vector_search/indexes/ep.cookbook_local_test.test', indexed_row_count=1, message='Index creation succeeded. Check latest status: https://e2-dogfo

In [None]:
VectorSearchRetrieverTool(
    vector_search_retriever=VectorSearchRetriever(retriever_config),
    # The prompt that describes the tool so the LLM can decide when it is relevant to call.
    tool_description_prompt="Search for documents that are relevant to a user's query about the [REPLACE WITH DESCRIPTION OF YOUR DOCS].",
    # The tool's name. Used in combination with `tool_description_prompt` so the LLM can decide when the tool is relevant to call.
    tool_name="retrieve_documents",
    # Retriever prompts: Tune these prompts if the Agent uses the retriever incorrectly, e.g., doesn't call the retriever tool for the right queries or translates the user's intent to a query incorrectly.
    # Describes what should go in the `query` parameter, which the vector index uses to search for relevant documents.
    retriever_query_parameter_prompt="The query to find documents for.",
    retriever_filter_parameter_prompt="Optional filters to apply to the search. An array of objects, each specifying a field name and the filters to apply to that field.",
)



In [58]:
retriever_tool.model_dump()

{'name': 'search_customer_info',
 'description': 'Use this tool to find customer info',
 'vector_search_index': 'ep.cookbook_local_test.my_pdfs_docs_chunked_index__v1',
 'filterable_columns': [],
 'vector_search_schema': {'chunk_text': 'content_chunked',
  'document_uri': 'doc_uri',
  'additional_metadata_columns': ['parser_status']},
 'doc_similarity_threshold': 0.0,
 'vector_search_parameters': {'num_results': 5, 'query_type': 'ann'},
 'retriever_query_parameter_prompt': 'query to look up in retriever',
 'retriever_filter_parameter_prompt': 'optional filters to apply to the search. An array of objects, each specifying a field name and the filters to apply to that field.'}

In [1]:
# Import Pydantic models
from cookbook.config.agents.function_calling_agent import (
    FunctionCallingAgentConfig,
)
from cookbook.config.common.llm import LLMConfig, LLMParametersConfig


########################
# #### 🚫✏️ Load the Vector Index location from the data pipeline configuration
########################

# This loads the Vector Index Unity Catalog location from the data pipeline configuration.

# Usage:
# - If you used `01_data_pipeline` to create your Vector Index, run this cell.
# - If your Vector Index was created elsewhere, skip this cell and set the UC location in the Retriever config.
from cookbook.config.data_pipeline import (
    DataPipelineConfig,
)

data_pipeline_config = DataPipelineConfig.from_yaml_file(
    "./configs/data_pipeline_config.yaml"
)



from cookbook.config.tools.uc_tool import UCTool

translate_sku_tool = UCTool(uc_function_name="ep.cookbook_local_test.translate_sku")

########################
# #### ✅✏️ Retriever tool that connects to the Vector Search index
########################

from cookbook.config.tools.vector_search_tool import VectorSearchRetrieverTool, VectorSearchSchema, VectorSearchParameters
# from utils.agents.tools_2 import load_obj_from_yaml
retriever_tool = VectorSearchRetrieverTool(
    name="search_customer_info",
    description="Use this tool to find customer info",
    vector_search_index=data_pipeline_config.output.vector_index,
    vector_search_schema=VectorSearchSchema(
        chunk_text="content_chunked",
        document_uri="doc_uri",
        additional_metadata_columns=["parser_status"]
    ),
    # doc_similarity_threshold=0.0,
    # vector_search_parameters=VectorSearchParameters(
    #     num_results=5,
    #     query_type="ann"
    # ),
    # filterable_columns=["doc_uri", "chunk_id"]
)

doc_tool = VectorSearchRetrieverTool(
    name="search_product_docs",
    description="Use this tool to search for product documentation.",
    vector_search_index="ericpeter_catalog.agents.db_docs_app_chunked_docs_index",
    vector_search_schema=VectorSearchSchema(
        chunk_text="content_chunked",
        document_uri="doc_uri",
        additional_metadata_columns=["section_headers"]
    ),
    # doc_similarity_threshold=0.0,
    # vector_search_parameters=VectorSearchParameters(
    #     num_results=5,
    #     query_type="ann"
    # ),
    filterable_columns=["section_headers"]
)




# retriever_config = VectorSearchRetrieverTool(
#     vector_search_index=data_pipeline_config.output.vector_index,  # UC Vector Search index
#     # Retriever schema, this is required by Agent Evaluation to:
#     # 1. Enable the Review App to properly display retrieved chunks
#     # 2. Enable metrics / LLM judges to understand which fields to use to measure the retriever
#     # Each is a column name within the `vector_search_index`
#     vector_search_schema=VectorSearchSchema(
#         primary_key="chunk_id",  # The column name in the retriever's response referred to the unique key
#         chunk_text="content_chunked",  # The column name in the retriever's response that contains the returned chunk
#         document_uri="doc_uri",  # The URI of the chunk - displayed as the document ID in the Review App
#         additional_metadata_columns=[],  # Additional columns to return from the vector database and present to the LLM
#     ),
#     # Parameters defined by Vector Search docs: https://docs.databricks.com/en/generative-ai/create-query-vector-search.html#query-a-vector-search-endpoint
#     vector_search_parameters=VectorSearchParameters(
#         num_results=5,  # Number of search results that the retriever returns
#         query_type="ann",  # Type of search: ann or hybrid
#     ),
#     doc_similarity_threshold=0.0,  # 0 to 1, similarity threshold cut off for retrieved docs.  Increase if the retriever is returning irrelevant content.
    
# )

########################
#### ✅✏️ LLM configuration
########################

llm_config = LLMConfig(
    llm_endpoint_name="ep-gpt4o-new",  # Model serving endpoint
    llm_system_prompt_template=(
        """You are a helpful assistant that answers questions by calling tools.  Provide responses ONLY based on the outputs from tools.  If you do not have a relevant tool for a question, respond with 'Sorry, I'm not trained to answer that question'."""
    ),  # System prompt template
    llm_parameters=LLMParametersConfig(
        temperature=0.01, max_tokens=1500
    ),  # LLM parameters
)

agent_config = FunctionCallingAgentConfig(
    llm_config=llm_config,
    tools=[
        retriever_tool, 
        translate_sku_tool,
        doc_tool
    ],
    input_example={
        "messages": [
            {
                "role": "user",
                "content": "What is RAG?",
            },
        ]
    },
)


########################
##### 🚫✏️ Dump the configuration to a YAML
########################


# We dump the Pydantic model to a YAML file because:
# 1. MLflow ModelConfig only accepts YAML files or dictionaries
# 2. When importing the Agent's code, it needs to read this configuration
def write_dict_to_yaml(data, file_path):
    with open(file_path, "w") as file:
        yaml.dump(data, file, default_flow_style=False)


with open("./configs/agent_model_config.yaml", "w") as handle:
    agent_config_yml = agent_config.to_yaml()
    handle.write(agent_config_yml)

########################
#### Print resulting config to the console
########################
print(json.dumps(agent_config.model_dump(), indent=4))

{
    "tools": [
        {
            "class_path": "utils.agents.vector_search.VectorSearchRetrieverTool",
            "description": "Use this tool to find customer info",
            "doc_similarity_threshold": 0.0,
            "filterable_columns": [],
            "name": "search_customer_info",
            "retriever_filter_parameter_prompt": "optional filters to apply to the search. An array of objects, each specifying a field name and the filters to apply to that field.",
            "retriever_query_parameter_prompt": "query to look up in retriever",
            "vector_search_index": "ep.cookbook_local_test.my_pdfs_docs_chunked_index__v1",
            "vector_search_parameters": {
                "num_results": 5,
                "query_type": "ann"
            },
            "vector_search_schema": {
                "additional_metadata_columns": [
                    "parser_status"
                ],
                "chunk_text": "content_chunked",
                "documen

In [1]:
from cookbook.config.base import load_serializable_config_from_yaml

with open("./configs/agent_model_config.yaml", "r") as handle:
    yaml_str = handle.read()



test = load_serializable_config_from_yaml(yaml_str)
test.model_dump()
# type(test)
# test

{'tools': [{'class_path': 'utils.agents.vector_search.VectorSearchRetrieverTool',
   'description': 'Use this tool to find customer info',
   'doc_similarity_threshold': 0.0,
   'filterable_columns': [],
   'name': 'search_customer_info',
   'retriever_filter_parameter_prompt': 'optional filters to apply to the search. An array of objects, each specifying a field name and the filters to apply to that field.',
   'retriever_query_parameter_prompt': 'query to look up in retriever',
   'vector_search_index': 'ep.cookbook_local_test.my_pdfs_docs_chunked_index__v1',
   'vector_search_parameters': {'num_results': 5, 'query_type': 'ann'},
   'vector_search_schema': {'additional_metadata_columns': ['parser_status'],
    'chunk_text': 'content_chunked',
    'document_uri': 'doc_uri'}},
  {'class_path': 'utils.agents.uc_tool.UCTool',
   'error_prompt': 'Error in generated code.  Please think step-by-step about how to fix the error and try calling this tool again with corrected inputs that reflec

`load_obj_from_yaml` --> the only way a class is loaded; it reads the class path key

* Load (import) the class from the class path key
* Call that class's `_from_dict` method with the remaining data to let it do anything custom, e.g., load the tools

`model_dump` --> the only way the class is dumped; the output includes the class path key
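
The loading mechanism described in these notes can be sketched with `importlib`. This is an illustration under stated assumptions, not the cookbook's implementation: `load_from_dict` and `CLASS_PATH_KEY` are hypothetical names, the fallback to plain keyword arguments is an addition for the demo, and `datetime.timedelta` stands in for a config class so the example runs without the cookbook installed.

```python
import importlib
from datetime import timedelta
from typing import Any, Dict

CLASS_PATH_KEY = "class_path"  # key name assumed from the dumps shown in this notebook


def load_from_dict(data: Dict[str, Any]) -> Any:
    """Import the class named by the class path key and build it from the rest."""
    remaining = dict(data)  # copy so the caller's dict is not mutated
    class_path = remaining.pop(CLASS_PATH_KEY)
    module_name, class_name = class_path.rsplit(".", 1)
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    # Prefer a custom hook if the class defines one; otherwise pass fields as kwargs
    from_dict = getattr(cls, "_from_dict", None)
    return from_dict(remaining) if from_dict else cls(**remaining)


# Stand-in payload: a stdlib class path plus its constructor fields
payload = {"class_path": "datetime.timedelta", "hours": 1, "minutes": 30}
loaded = load_from_dict(payload)
print(type(loaded).__name__, loaded.total_seconds())
```

Because the class is resolved by name at load time, the module in the class path has to be importable wherever the config is loaded.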

In [1]:
from cookbook.config.agents.function_calling_agent import FunctionCallingAgentConfig

FunctionCallingAgentConfig.from_dict(FunctionCallingAgentConfig, {})

AttributeError: from_dict

In [6]:

# import importlib

# class_path = "utils.agents.function_calling_agent.FunctionCallingAgentConfig"

# module_name, class_name = class_path.rsplit(".", 1)
# # First we import the module
# module = importlib.import_module(module_name)  # imports datetime module
# # Then we get the class
# test = getattr(module, class_name)   # gets datetime.datetime class

# # Now datetime_class is the actual datetime class and can be used to create instances
# # instance = datetime_class.now()
# new = test.from_dict()

AttributeError: from_dict

In [2]:
from cookbook.config.base import load_serializable_config_from_yaml

with open("./configs/agent_model_config.yaml", "r") as handle:
    yaml_str = handle.read()

test = load_serializable_config_from_yaml(yaml_str)
print("---")
print(test)
print("---")

<class 'utils.agents.function_calling_agent.FunctionCallingAgentConfig'>
<class 'pydantic._internal._model_construction.ModelMetaclass'>
{'input_example': {'messages': [{'content': 'What is RAG?', 'role': 'user'}]}, 'llm_config': {'llm_endpoint_name': 'ep-gpt4o-new', 'llm_parameters': {'max_tokens': 1500, 'temperature': 0.01}, 'llm_system_prompt_template': "You are a helpful assistant that answers questions by calling tools.  Provide responses ONLY based on the outputs from tools.  If you do not have a relevant tool for a question, respond with 'Sorry, I'm not trained to answer that question'."}, 'tools': [{'class_path': 'utils.agents.vector_search.VectorSearchRetrieverTool', 'description': 'Use this tool to find customer info', 'doc_similarity_threshold': 0.0, 'filterable_columns': [], 'name': 'search_customer_info', 'retriever_filter_parameter_prompt': 'optional filters to apply to the search. An array of objects, each specifying a field name and the filters to apply to that field.',

things to solve for 
- load agent from known class
- load config from known class --> solved by SerializableModel
- OR 
- load from model serving endpoint


have agent code in a notebook - not possible

local dev loop without endpoint

deploy multi agent
- deploy each supervised agent
- get the model serving endpoint for each
- update the multi agent config with the model serving endpoints
- deploy the multi agent


In [9]:
retriever_tool.__class__.__module__

import sys
sys.modules[retriever_tool.__class__.__module__]

<module 'utils.agents.vector_search' from '/Users/eric.peter/Github/genai-cookbook/agent_app_sample_code/utils/agents/vector_search.py'>
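
The lookup above shows that an object's class records the module it was defined in. A minimal sketch of turning that into a dotted class path string follows, using a standard-library object as a stand-in for `retriever_tool`. `class_path_of` is a hypothetical helper introduced for this example.

```python
import sys
from fractions import Fraction


def class_path_of(obj: object) -> str:
    """Build a dotted "module.ClassName" path for an object's class (hypothetical helper)."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


value = Fraction(1, 3)  # stand-in for a config or tool instance
path = class_path_of(value)
module = sys.modules[type(value).__module__]

print(path)
print(module.__name__)
```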

In [34]:
genie_agent.model_dump()

{'genie_space_id': '01ef92e3b5631f0da85834290964831d',
 'input_example': {'messages': [{'role': 'user',
    'content': 'What data can you query?'}]}}

In [2]:
import logging
logging.getLogger().setLevel(logging.INFO)


In [6]:
from cookbook.agents.common.load_config import TMP_CONFIG_FILE_NAME

globals()[TMP_CONFIG_FILE_NAME] ="dsfsd"
globals().get(
                TMP_CONFIG_FILE_NAME, None
            )

'dsfsd'

In [1]:
import os
import time

import mlflow
from mlflow.models.rag_signatures import ChatCompletionRequest, StringResponse
from mlflow.models.signature import ModelSignature

# from agents.genie_agent.genie_agent import GenieAgentConfig
from cookbook.agents.common.load_config import TMP_CONFIG_FILE_NAME
from cookbook.config.agents.genie_agent import GenieAgentConfig
from cookbook.config.base import (
    _CLASS_PATH_KEY,
    SerializableConfig,
    serializable_config_to_yaml_file,
)

def log_agent_to_mlflow(agent_config: SerializableConfig, agent_code_path: str):
    # Generate unique config filename that the agent's code will read when being logged by MLflow
    # tmp_config_file = f"./configs/tmp/agent_config__{agent_config.__class__.__name__}__{int(time.time())}.yaml"
    # print(tmp_config_file)
    # globals()[TMP_CONFIG_FILE_NAME] =tmp_config_file
    # print(globals()[TMP_CONFIG_FILE_NAME_ENV_VAR_NAME])

    # Dump the config to that YAML file
    # obj_to_yaml_file(agent_config, tmp_config_file)
    with mlflow.start_run():
        logged_agent_info = mlflow.pyfunc.log_model(
            artifact_path="agent",
            python_model=agent_code_path,
            input_example=agent_config.input_example,
            model_config=agent_config.model_dump(),
            # resources=resource_dependencies,
            signature=ModelSignature(
                inputs=ChatCompletionRequest(),
                outputs=StringResponse(),  # TODO: Add in `messages` to signature
            ),
            code_paths=[os.path.join(os.getcwd(), "utils")],
        )
    return logged_agent_info




genie_agent = GenieAgentConfig(genie_space_id="01ef92e3b5631f0da85834290964831d", input_example={
        "messages": [
            {
                "role": "user",
                "content": "What data can you query?",
            },
        ]
    },)

# obj_to_yaml_file(genie_agent, "./configs/genie_config.yaml")

test = log_agent_to_mlflow(genie_agent, "agents/genie_agent/genie_agent.py")


  from .autonotebook import tqdm as notebook_tqdm
Uploading artifacts: 100%|██████████| 41/41 [00:03<00:00, 11.98it/s]
Downloading artifacts: 100%|██████████| 41/41 [00:01<00:00, 31.05it/s] 
2024/11/05 21:12:46 INFO mlflow.tracking._tracking_service.client: 🏃 View run nervous-mule-831 at: https://e2-dogfood.staging.cloud.databricks.com/ml/experiments/3916415516852775/runs/a9b76a3936144c36be3aab6f58f24221.
2024/11/05 21:12:46 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: https://e2-dogfood.staging.cloud.databricks.com/ml/experiments/3916415516852775.


In [3]:
import mlflow
model = "runs:/a9b76a3936144c36be3aab6f58f24221/agent"
d = mlflow.pyfunc.load_model(model)

Downloading artifacts: 100%|██████████| 41/41 [00:02<00:00, 18.79it/s]  


In [53]:
data = {'genie_space_id': '01ef92e3b5631f0da85834290964831d', 'input_example': {'messages': [{'content': 'What data can you query?', 'role': 'user'}]}}

GenieAgentConfig(**data)

GenieAgentConfig(genie_space_id='01ef92e3b5631f0da85834290964831d', input_example={'messages': [{'content': 'What data can you query?', 'role': 'user'}]})

In [6]:
from databricks.sdk import WorkspaceClient

ws = WorkspaceClient()
# ws.get_model_version_download_uri(model, "production")


In [7]:
d.predict({
        "messages": [
            {
                "role": "user",
                "content": "What data can you query?",
            },
        ]
    })

{'content': 'I can query data from the following tables within the `ep`.`agent_demo` schema:\n\n1. `churn_app_events`: Records user interactions with a mobile app, including user ID, event ID, platform, date, action, session ID, and URL.\n2. `churn_orders`: Contains information about orders made by users, including order amount, order ID, user ID, item count, and creation date.\n3. `churn_users`: Contains information about all users, including demographics, user activity, and churn status.\n4. `churn_prediction`: Contains information about user churn prediction, including demographic and behavioral data, and various user attributes.\n\nThese tables can be used to analyze user behavior, identify trends, understand user demographics, and predict user churn.',
 'messages': [{'role': 'user', 'content': 'What data can you query?'},
  {'role': 'assistant',
   'tool_calls': [{'id': 'call_c57288cd80d44ccc8a20fc7e168e814e',
     'function': {'arguments': '{"query": "What data can you query?"}',

In [4]:
retriever_tool.get_json_schema()

{'type': 'function',
 'function': {'name': 'search_product_docs',
  'description': 'Use this tool to search for product documentation.',
  'parameters': {'properties': {'query': {'description': 'query to look up in retriever',
     'type': 'string'},
    'type': 'object',
    'required': ['query'],
    'additionalProperties': False}}}}

In [5]:
%load_ext autoreload
%autoreload 2


In [11]:
from databricks import agents

print(agents.deploy.__doc__)


    Deploy new version of the agents.

    :param model_name: Name of UC registered model
    :param model_version: Model version #
    :param scale_to_zero: Flag to scale the endpoint to zero when not in use. With scale to zero,
    the compute resources may take time to come up so the app may not be ready instantly. (default: False)
    :param environment_vars: Dictionary of environment variables used to provide configuration for the endpoint (default: {})
    :param instance_profile_arn: Instance profile ARN to use for the endpoint (default: None)
    :param tags: Dictionary of tags to attach to the deployment (default: None)

    :return: Chain deployment metadata.
    


In [14]:
from agents.genie_agent.genie_agent import GenieAgent

GenieAgent

agents.genie_agent.genie_agent.GenieAgent

Design for multi-agent

Requirements:
* You can test locally with just the agents' pyfunc classes.
* When you change any config, everything reloads.

When you deploy:
* You deploy each supervised agent separately to Model Serving.
* The multi-agent then picks up these deployed agents.
* The multi-agent is then deployed.

* Each child agent has [name, description, config, code]:
 - When deployed, it reads these from UC.
 - Locally, it reads these from the config.
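
The design notes above can be sketched as a small data structure. This is only an illustration under stated assumptions, not the cookbook's implementation: `ChildAgent`, `resolve_child_agent`, the `"deployed"` mode value, and the `uc_lookup` callback are hypothetical names. Only the `"local"` value mirrors the `agent_loading_mode="local"` setting that appears in this notebook.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class ChildAgent:
    """Hypothetical record for one supervised (child) agent."""

    name: str
    description: str
    config: Dict[str, Any] = field(default_factory=dict)
    code: str = ""  # e.g. a path to the agent's pyfunc code


def resolve_child_agent(
    name: str,
    loading_mode: str,
    local_agents: Dict[str, ChildAgent],
    uc_lookup: Callable[[str], ChildAgent],
) -> ChildAgent:
    """Return a child agent from local config or from a stand-in UC lookup."""
    if loading_mode == "local":
        # Local testing: use the in-memory config directly.
        return local_agents[name]
    if loading_mode == "deployed":
        # Deployed: delegate to whatever reads the agent's metadata from UC.
        return uc_lookup(name)
    raise ValueError(f"Unknown loading mode: {loading_mode!r}")


# Usage with an in-memory stand-in for the UC lookup
local_agents = {
    "CustomerData": ChildAgent(name="CustomerData", description="Structured customer data")
}
agent = resolve_child_agent("CustomerData", "local", local_agents, uc_lookup=lambda n: None)
```

Keeping the lookup behind a callback lets the same supervisor code run locally without any deployed endpoints.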

In [4]:
from cookbook.config.agents.multi_agent import MultiAgentSupervisorConfig, SupervisedAgentConfig
from cookbook.config.common.llm import LLMConfig, LLMParametersConfig
from cookbook.config.base import serializable_config_to_yaml_file

# Create the supervisor config
supervisor_config = MultiAgentSupervisorConfig(
    llm_endpoint_name="ep-gpt4o-new",
    llm_parameters=LLMParametersConfig(
            temperature=0.01,
            max_tokens=1500
        ),
    # input_example={
    #     "messages": [
    #         {
    #             "role": "user", 
    #             "content": "What can you help me with?"
    #         }
    #     ]
    # },
    playground_debug_mode=True,
    # agent_loading_mode="local",
    max_workers_called=5,
    agents=[
        SupervisedAgentConfig(
            description="Has access to the product documentation, transcripts from our customer service call center and information about customer's recent orders.",
            name="CustomerServiceTranscripts",
            endpoint_name="agents_ep-agent_demo-customer_bot_function_calling_agent"
        ),
        SupervisedAgentConfig(
            description="Has access to structured data about our customers, their orders, their activity on our application, and their likelihood to churn.",
            name="CustomerData",
            endpoint_name="agents_ep-agent_demo-customer_bot_genie_agent"
        )
    ]
)
supervisor_config.model_dump()

serializable_config_to_yaml_file(supervisor_config, "./configs/multi_agent_supervisor_config.yaml")

In [1]:
from cookbook.config.base import load_serializable_config_from_yaml

with open("./configs/multi_agent_supervisor_config.yaml", "r") as handle:
    yaml_str = handle.read()



test = load_serializable_config_from_yaml(yaml_str)
test.model_dump()
# type(test)
# test

{'llm_endpoint_name': 'ep-gpt4o-new',
 'llm_parameters': {'temperature': 0.01, 'max_tokens': 1500},
 'input_example': {'messages': [{'content': 'What can you help me with?',
    'role': 'user'}]},
 'playground_debug_mode': True,
 'agent_loading_mode': 'local',
 'max_workers_called': 5,
 'supervisor_system_prompt': '## Role\nYou are a supervisor responsible for managing a conversation between a user and the following workers.  You select the next worker to respond or end the conversation to return the last worker\'s response to the user.  Use the {ROUTING_FUNCTION_NAME} function to share your step-by-step reasoning and decision.\n\n## Workers\n<workers>{workers_names_and_descriptions}</workers>\n\n## Objective\nYour goal is to facilitate the conversation and ensure the user receives a helpful response.\n\n## Instructions\n1. **Review the Conversation History**: Think step by step by to understand the user\'s request and the conversation history which includes previous worker\'s response

In [1]:
from cookbook.config.agents.genie_agent import GenieAgentConfig
# from agents.genie_agent.genie_agent import GenieAgent

genie_agent = GenieAgentConfig(genie_space_id="01ef92e3b5631f0da85834290964831d", input_example={
        "messages": [
            {
                "role": "user",
                "content": "What data can you query?",
            },
        ]
    },)

genie_agent
with open("./configs/genie_config.yaml", "w") as handle:
    genie_config_yml = genie_agent.to_yaml()
    handle.write(genie_config_yml)



In [1]:
from cookbook.config.base import load_serializable_config_from_yaml

with open("./configs/genie_config.yaml", "r") as handle:
    yaml_str = handle.read()



test = load_serializable_config_from_yaml(yaml_str)
test.model_dump()
# type(test)
# test

{'genie_space_id': '01ef92e3b5631f0da85834290964831d',
 'input_example': {'messages': [{'content': 'What data can you query?',
    'role': 'user'}]},
 'encountered_error_user_message': 'I encountered an error trying to answer your question, please try again.',
 'class_path': 'utils.agents.genie_agent.GenieAgentConfig'}

In [2]:
from cookbook.config.agents.multi_agent import MultiAgentSupervisorConfig
from cookbook.config.common.llm import LLMConfig

multi_agent_config = MultiAgentSupervisorConfig(
    llm_config=LLMConfig(
        llm_endpoint_name="ep-gpt4o-new",
        llm_parameters={
            "max_tokens": 1500,
            "temperature": 0.01
        },
        llm_system_prompt_template="You are a helpful assistant that answers questions by calling tools. Provide responses ONLY based on the outputs from tools. If you do not have a relevant tool for a question, respond with 'Sorry, I'm not trained to answer that question'."
    ),
    input_example={
        "messages": [
            {
                "role": "user", 
                "content": "What data can you query?"
            }
        ]
    },
    playground_debug_mode=False,
    agent_loading_mode="local",
    agents=[genie_agent]
)

with open("./configs/multi_agent_config.yaml", "w") as handle:
    genie_config_yml = multi_agent_config.to_yaml()
    handle.write(genie_config_yml)


In [1]:
from cookbook.config.base import load_serializable_config_from_yaml

with open("./configs/multi_agent_config.yaml", "r") as handle:
    yaml_str = handle.read()



test = load_serializable_config_from_yaml(yaml_str)
test.model_dump()
# type(test)
# test

AttributeError: module 'utils.agents.multi_agent' has no attribute 'MultiAgentConfig'

In [11]:
from cookbook.config.base import load_serializable_config_from_yaml
genie_agent.model_dump()

test = load_serializable_config_from_yaml(genie_agent.to_yaml())

In [12]:
test

GenieAgentConfig(genie_space_id='01ef92e3b5631f0da85834290964831d', input_example={'messages': [{'content': 'What data can you query?', 'role': 'user'}]}, encountered_error_user_message='I encountered an error trying to answer your question, please try again.')
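
The Genie config dump earlier in this section carries a `class_path` entry, and the failed multi-agent load reports a module that has no `MultiAgentConfig` attribute. One plausible reading is that the loader imports the module named in `class_path` and looks up the class by name. The sketch below illustrates that general pattern using only the standard library. It is not the cookbook's `load_serializable_config_from_yaml`: `load_config_from_dict` is a hypothetical helper, a plain dict stands in for parsed YAML, and `datetime.timedelta` stands in for a config class.

```python
import importlib


def load_config_from_dict(data: dict):
    """Rebuild an object from a dict that carries a `class_path` key."""
    fields = dict(data)  # copy so the caller's dict is not modified
    module_name, _, class_name = fields.pop("class_path").rpartition(".")
    module = importlib.import_module(module_name)
    # getattr raises AttributeError when the class named in the serialized
    # data no longer exists in the module (for example, after a rename).
    config_class = getattr(module, class_name)
    return config_class(**fields)


# A stdlib class stands in for a config class here.
loaded = load_config_from_dict({"class_path": "datetime.timedelta", "days": 1, "hours": 2})

# A stale class name fails in the same way as the error shown above.
try:
    load_config_from_dict({"class_path": "datetime.NoSuchConfig"})
except AttributeError as err:
    stale_error = str(err)
```

If the loader works this way, a config file written before a class was renamed or moved would need to be regenerated.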

#### ✅✏️ Adjust the Agent's code

Here, we import the Agent's code so we can run the Agent locally within the notebook.  To modify the code, open the Agent's notebook (the one referenced by `%run` in the cell below) in a separate window, make your changes, and re-run this cell.

**Important: Typically, when building the first version of your agent, you will not need to modify the code.**

In [0]:
%run ./agents/function_calling_agent/function_calling_agent_mlflow_sdk

## 2️⃣ Evaluate the Agent's quality

Once you have modified the code & config to create a version of your Agent, there are 3 ways you can test its quality.

Each mode follows the same high-level steps:
1. Logs the Agent's code and config to an MLflow Run. This captures your code/config in case you need to return to this version after modifying the notebook.
2. Runs the Agent for 1+ queries.
3. Lets you inspect the Agent's outputs for those queries.

To get started, pick a mode and scroll to the relevant cells below.  If this is your first time, start with 🅰.

- 🅰 Vibe check the Agent for a single query
  - Use this mode in your inner dev loop to iterate and debug on a single query while making a change.
- 🅱 Evaluate the Agent for 1+ queries
  - Use this mode when you don't yet have an evaluation set defined but want to test a version of the Agent against multiple queries to ensure your change doesn't cause a regression for other queries.
- 🅲 Evaluate the Agent using your evaluation set
  - Use this mode once you have an evaluation set defined.  It is the same as 🅱, but uses your evaluation set that is stored in a Delta Table.



###### 🚫✏️ Helper function to log the Agent to MLflow

This helper function wraps the code required to log a version of the Agent's code & config to MLflow.  It is used by all 3 modes.


#### ✅✏️ 🅰 Vibe check the Agent for a single query

Running this cell will produce an MLflow Trace that you can use to see the Agent's outputs and understand the steps it took to produce that output.

In [0]:
# Query
vibe_check_query = {
    "messages": [
        {"role": "user", "content": "what is lakehouse monitoring?"},
    ]
}


def log_agent_to_mlflow():
    resource_dependencies = get_agent_dependencies(agent_config=agent_config)
    return log_pyfunc_agent(
        resource_dependencies=resource_dependencies,
        agent_definition_file_path="agents/function_calling_agent/function_calling_agent_mlflow_sdk",
        input_example=agent_config.input_example,
    )


# `run_name` provides a human-readable name for this vibe check in the MLflow experiment
with mlflow.start_run(
    run_name="vibe-check__" + datetime.now().strftime("%Y-%m-%d_%I:%M:%S_%p")
):
    # Log the current Agent code/config to MLflow
    logged_agent_info = log_agent_to_mlflow()

    # Execute the Agent
    agent = FunctionCallingAgent(agent_config=agent_config)

    # Run the agent for this query
    response = agent.predict(model_input=vibe_check_query)

    # Print Agent's output
    display_markdown(f"### Agent's output:\n{response['content']}", raw=True)
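
The run names in this notebook combine a prefix with a timestamp. A minimal sketch of that pattern as a reusable helper follows. `make_run_name` is a hypothetical name, not part of the cookbook. It reuses the same format string as the cells in this notebook, which is a 12-hour clock (`%I` with `%p`), so names will not sort chronologically across AM and PM.

```python
from datetime import datetime
from typing import Optional


def make_run_name(prefix: str, now: Optional[datetime] = None) -> str:
    """Build a human-readable MLflow run name such as `vibe-check__<timestamp>`."""
    now = now or datetime.now()
    # Same format string as used elsewhere in this notebook.
    return f"{prefix}__" + now.strftime("%Y-%m-%d_%I:%M:%S_%p")


# Passing a fixed time makes the result reproducible.
example_name = make_run_name("vibe-check", datetime(2024, 11, 5, 21, 12, 46))
```

The result can be passed as `run_name` to `mlflow.start_run`, as the cells in this notebook already do with the inline expression.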

#### ✅✏️ 🅱 Evaluate the Agent for 1+ queries

Running this cell will call [Agent Evaluation](https://docs.databricks.com/en/generative-ai/agent-evaluation/index.html) which will run your Agent to generate outputs for 1+ queries and then evaluate the Agent's quality and assess the root cause of any quality issues. The resulting outputs, MLflow Trace, and evaluation results are available in the MLflow Run.

In [0]:
import pandas as pd

evaluation_set = [
    {  # query 1
        "request": {
            "messages": [
                {"role": "user", "content": "what is lakehouse monitoring?"},
            ]
        }
    },
    {  # query 2
        "request": {
            "messages": [
                {"role": "user", "content": "what is rag?"},
            ]
        }
    },
    # add more queries here
]

# `run_name` provides a human-readable name for this vibe check in the MLflow experiment
with mlflow.start_run(
    run_name="vibe-check__" + datetime.now().strftime("%Y-%m-%d_%I:%M:%S_%p")
):
    # Log the current Agent code/config to MLflow
    logged_agent_info = log_agent_to_mlflow()

    # Run the agent for these queries, using Agent evaluation to parallelize the calls
    eval_results = mlflow.evaluate(
        model=logged_agent_info.model_uri,  # use the logged Agent
        data=pd.DataFrame(
            evaluation_set
        ),  # Run the logged Agent for all queries defined above
        model_type="databricks-agent",  # use Agent Evaluation
    )

    # Show all outputs.  Click on a row in this table to display the MLflow Trace.
    display(eval_results.tables["eval_results"])

    # Click 'View Evaluation Results' to see the Agent's inputs/outputs + quality evaluation displayed in a UI
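
Each row of the evaluation set above wraps one user question in a `request` with a `messages` list. When you have many questions, writing these dictionaries by hand gets repetitive. The sketch below shows one way to generate the same shape from plain strings. `build_evaluation_set` is a hypothetical helper, not part of the cookbook or Agent Evaluation.

```python
from typing import Dict, List


def build_evaluation_set(questions: List[str]) -> List[Dict]:
    """Wrap each question in the `request` / `messages` shape used above."""
    return [
        {"request": {"messages": [{"role": "user", "content": question}]}}
        for question in questions
    ]


rows = build_evaluation_set(["what is lakehouse monitoring?", "what is rag?"])
```

The resulting list can be passed to `pd.DataFrame(...)` exactly as the hand-written `evaluation_set` is.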

#### ✅✏️ 🅲 Evaluate the Agent using your evaluation set

Note: If this is your first time creating this agent, this cell will not work.  The evaluation set is populated in the next notebooks using stakeholder feedback.

In [0]:
# Load the evaluation set from Delta Table
evaluation_set = spark.table(cookbook_shared_config.evaluation_set_table).toPandas()

# `run_name` provides a human-readable name for this evaluation run in the MLflow experiment
with mlflow.start_run(
    run_name="evaluation__" + datetime.now().strftime("%Y-%m-%d_%I:%M:%S_%p")
):
    # Log the current Agent code/config to MLflow
    logged_agent_info = log_agent_to_mlflow()

    # Run the agent for these queries, using Agent evaluation to parallelize the calls
    eval_results = mlflow.evaluate(
        model=logged_agent_info.model_uri,  # use the logged Agent
        data=evaluation_set,  # Run the logged Agent for all queries in the evaluation set
        model_type="databricks-agent",  # use Agent Evaluation
    )

    # Show all outputs.  Click on a row in this table to display the MLflow Trace.
    display(eval_results.tables["eval_results"])

    # Click 'View Evaluation Results' to see the Agent's inputs/outputs + quality evaluation displayed in a UI

## 3️⃣ Register a version of the Agent to Unity Catalog

Once you have a version of your Agent that has sufficient quality, you will register the Agent's model from the MLflow Experiment into the Unity Catalog.  This allows you to use the next notebooks to deploy the Agent to Agent Evaluation's Review App to share it with stakeholders & collect feedback.

You can register a version in two ways:
1. Register an Agent version that you logged above
2. Log the latest version of the Agent and then register it

##### ✅✏️ Option 1. Register an Agent that you logged above

1. Set the MLflow model's URI in the cell below using one of these approaches:
  - *(Suggested)* If you haven't modified the above code, the `model_uri` from the last logged Agent is stored in the local variable `logged_agent_info.model_uri`.
  - If you want to register a different version:
    - Go to the MLflow experiment UI, click on the Run containing the Agent version, and find the Run ID in the Overview tab --> Details section.
    - Your `model_uri` has the form `runs:/b5b9436a56544263a97ddd2293e6f422/agent`, where `b5b9436a56544263a97ddd2293e6f422` is the Run ID.

2. Run the below cell to register the model to Unity Catalog
3. Note the version of the Unity Catalog model - you will need this in the next notebook to deploy this Agent.
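
If you copy a Run ID from the UI, a small helper can assemble the `model_uri` and catch copy/paste mistakes early. This is a sketch under assumptions: `build_model_uri` is a hypothetical helper, and the check that a Run ID is 32 lowercase hexadecimal characters is inferred from the example ID above rather than taken from an MLflow guarantee. The default `agent` artifact path matches the URI pattern shown in the steps above.

```python
import re


def build_model_uri(run_id: str, artifact_path: str = "agent") -> str:
    """Return a `runs:/<run_id>/<artifact_path>` URI for a logged model."""
    run_id = run_id.strip()
    # Assumption: Run IDs look like the 32-character hex example above.
    if not re.fullmatch(r"[0-9a-f]{32}", run_id):
        raise ValueError(f"Unexpected Run ID format: {run_id!r}")
    return f"runs:/{run_id}/{artifact_path}"


model_uri_example = build_model_uri("b5b9436a56544263a97ddd2293e6f422")
```

The returned string can be assigned to `model_uri_to_register` in the cell below.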

In [0]:
# Enter the model_uri of the Agent version to be registered
model_uri_to_register = logged_agent_info.model_uri  # last Agent logged

# Register a different Agent version
# model_uri_to_register = "runs:/run_id_goes_here/agent" # pick an Agent version

# Use Unity Catalog as the model registry
mlflow.set_registry_uri("databricks-uc")

# Register the Agent's model to the Unity Catalog
uc_registered_model_info = mlflow.register_model(
    model_uri=model_uri_to_register, name=cookbook_shared_config.uc_model
)

# Print the version number
display_markdown(
    f"### Unity Catalog model version: **{uc_registered_model_info.version}**", raw=True
)

##### ✅✏️ Option 2. Log the latest version of the Agent and register it.

1. Optionally, give the version a short name in `agent_version_short_name` so you can easily identify it in the MLflow experiment later.
2. Run the cell below to log the Agent to a model inside an MLflow Run & register that model to Unity Catalog.
3. Note the version of the Unity Catalog model - you will need this in the next notebook to deploy this Agent.

In [0]:
agent_version_short_name = "friendly-name-to-identify-this-version"  # set to None if you want MLflow to generate a name e.g., `aged-perch-556`

with mlflow.start_run(run_name=agent_version_short_name):
    # Log the current Agent code/config to MLflow
    logged_agent_info = log_agent_to_mlflow()

    # Use Unity Catalog as the model registry
    mlflow.set_registry_uri("databricks-uc")

    # Register this model to the Unity Catalog
    uc_registered_model_info = mlflow.register_model(
        model_uri=logged_agent_info.model_uri, name=cookbook_shared_config.uc_model
    )

    # Print the version number
    display_markdown(
        f"### Unity Catalog model version: **{uc_registered_model_info.version}**",
        raw=True,
    )