## 👉 START HERE: How to use this notebook

### Step 1: Agent storage configuration

This notebook initializes an `AgentStorageConfig` Pydantic class that defines where the Agent's code/config and its supporting data & metadata are stored in Unity Catalog and MLflow:
- **Unity Catalog Model:** Stores staging/production versions of the Agent's code/config
- **MLflow Experiment:** Stores every development version of the Agent's code/config, each version's associated quality/cost/latency evaluation results, and any MLflow Traces from your development & evaluation processes
- **Evaluation Set Delta Table:** Stores the Agent's evaluation set

This notebook does the following:
1. Validates the provided locations exist.
2. Serializes this configuration to `configs/agent_storage_config.yaml` so other notebooks can use it.
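
For orientation, the three locations above can be pictured as a small record plus a name check. The sketch below is not the cookbook's `AgentStorageConfig`: it uses a standard-library dataclass as a stand-in for the Pydantic model, and the names `StorageLocationsSketch` and `build_locations` are hypothetical. It only checks that Unity Catalog names have the three-level `catalog.schema.name` shape. It does not check that the catalog or schema exist, which the notebook's own validation step does against the workspace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageLocationsSketch:
    """Hypothetical stand-in for AgentStorageConfig (the real class is a Pydantic model)."""

    uc_model_name: str
    evaluation_set_uc_table: str
    mlflow_experiment_name: str

    def __post_init__(self):
        # Unity Catalog objects are addressed with three-level names: catalog.schema.name
        for field_name in ("uc_model_name", "evaluation_set_uc_table"):
            value = getattr(self, field_name)
            parts = value.split(".")
            if len(parts) != 3 or not all(parts):
                raise ValueError(
                    f"{field_name} must look like catalog.schema.name, got {value!r}"
                )


def build_locations(catalog, schema, agent_name, user_email):
    """Derive all three locations from one catalog, schema, and agent name."""
    base = f"{catalog}.{schema}.{agent_name}"
    return StorageLocationsSketch(
        uc_model_name=base,
        evaluation_set_uc_table=f"{base}_eval_set",
        mlflow_experiment_name=f"/Users/{user_email}/{agent_name}_mlflow_experiment",
    )
```

Deriving every location from a single agent name, as the configuration cell in this notebook does, keeps the model, evaluation table, and experiment easy to associate with one another.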

**Important note:** Throughout this notebook, we indicate which cells you:
- ✅✏️ *should* customize - these cells contain config settings to change
- 🚫✏️ *typically will not* customize - these cells contain boilerplate code required to validate / save the configuration

*Cells that don't require customization still need to be run!*

### 🚫✏️ Install Python libraries

In [0]:
%pip install -qqqq -U -r requirements.txt
%restart_python

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain 0.1.20 requires langchain-core<0.2.0,>=0.1.52, but you have langchain-core 0.3.24 which is incompatible.
langchain 0.1.20 requires langsmith<0.2.0,>=0.1.17, but you have langsmith 0.2.2 which is incompatible.
langchain-community 0.0.38 requires langchain-core<0.2.0,>=0.1.52, but you have langchain-core 0.3.24 which is incompatible.
langchain-community 0.0.38 requires langsmith<0.2.0,>=0.1.0, but you have langsmith 0.2.2 which is incompatible.
langchain-text-splitters 0.0.2 requires langchain-core<0.3,>=0.1.28, but you have langchain-core 0.3.24 which is incompatible.
ydata-profiling 4.5.1 requires pandas!=1.4.0,<2.1,>1.1, but you have pandas 2.2.3 which is incompatible.
ydata-profiling 4.5.1 requires pydantic<2,>=1.8.1, but you have pydantic 2.10.3 which is incompatible.
Note: you 

### 🚫✏️ Connect to Databricks

If you are running locally in an IDE with Databricks Connect, this cell connects the Spark client and configures MLflow to use Databricks Managed MLflow. If you are running in a Databricks Notebook, these values are already set.

In [0]:
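import os

from mlflow.utils import databricks_utils as du

# Outside a Databricks notebook (e.g., a local IDE), create the Spark session via Databricks Connect
if not du.is_in_databricks_notebook():
    from databricks.connect import DatabricksSession

    spark = DatabricksSession.builder.getOrCreate()
    # Point MLflow at Databricks Managed MLflow
    os.environ["MLFLOW_TRACKING_URI"] = "databricks"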

### 🚫✏️ Get current user info to set default values

In [0]:
from cookbook.databricks_utils import get_current_user_info

user_email, user_name, default_catalog = get_current_user_info(spark)

print(f"User email: {user_email}")
print(f"User name: {user_name}")
print(f"Default UC catalog: {default_catalog}")

User email: manffred.calvosanchez@databricks.com
User name: manffred_calvosanchez
Default UC catalog: sunny_uc


### ✅✏️ Configure your Agent's storage locations

Either review & accept the default values or enter your preferred locations.

In [0]:
from cookbook.config.shared.agent_storage_location import AgentStorageConfig
from cookbook.databricks_utils import get_mlflow_experiment_url
import mlflow

# Default values used to build the `AgentStorageConfig` below
agent_name = "my_agent_autogen"
uc_catalog_name = "casaman_ssa"
uc_schema_name = "demos"

# Agent storage configuration
agent_storage_config = AgentStorageConfig(
    uc_model_name=f"{uc_catalog_name}.{uc_schema_name}.{agent_name}",  # UC model to store staging/production versions of the Agent's code/config
    evaluation_set_uc_table=f"{uc_catalog_name}.{uc_schema_name}.{agent_name}_eval_set",  # UC table to store the evaluation set
    mlflow_experiment_name=f"/Users/{user_email}/{agent_name}_mlflow_experiment",  # MLflow Experiment to store development versions of the Agent and their associated quality/cost/latency evaluation results + MLflow Traces
)

# Validate the UC catalog and schema for the Agent's model & evaluation table
is_valid, msg = agent_storage_config.validate_catalog_and_schema()
if not is_valid:
    raise Exception(msg)

# Set the active MLflow experiment; it is created if it does not already exist
experiment_info = mlflow.set_experiment(agent_storage_config.mlflow_experiment_name)
# Also expose the experiment name as an environment variable (needed when running in a local IDE)
os.environ["MLFLOW_EXPERIMENT_NAME"] = agent_storage_config.mlflow_experiment_name

print(f"View the MLflow Experiment `{agent_storage_config.mlflow_experiment_name}` at {get_mlflow_experiment_url(experiment_info.experiment_id)}")

All catalogs and schemas exist for both model `casaman_ssa.demos.my_agent_autogen` and evaluation table `casaman_ssa.demos.my_agent_autogen_eval_set`.


2024/12/11 18:01:36 INFO mlflow.tracking.fluent: Experiment with name '/Users/manffred.calvosanchez@databricks.com/my_agent_autogen_mlflow_experiment' does not exist. Creating a new experiment.


View the MLflow Experiment `/Users/manffred.calvosanchez@databricks.com/my_agent_autogen_mlflow_experiment` at https://adb-984752964297111.11.azuredatabricks.net/ml/experiments/2822477370659093


### 🚫✏️ Save the configuration for use by other notebooks

In [0]:
from cookbook.config import serializable_config_to_yaml_file

serializable_config_to_yaml_file(agent_storage_config, "./configs/agent_storage_config.yaml")
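
Because other notebooks depend on the file written above, they may want to fail fast if this notebook has not been run yet. A minimal sketch of such a pre-flight check follows. It uses only the standard library, and the helper name `require_storage_config` is hypothetical, not part of the cookbook. It checks only that the file exists and does not parse or validate its contents.

```python
from pathlib import Path


def require_storage_config(path="./configs/agent_storage_config.yaml"):
    """Return the config path, or raise if the storage config has not been saved yet."""
    config_path = Path(path)
    if not config_path.is_file():
        raise FileNotFoundError(
            f"{config_path} not found; run the agent storage configuration notebook first."
        )
    return config_path
```

A downstream notebook could call `require_storage_config()` in its first cell, before loading the configuration with whatever loader the cookbook provides.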