# Tool-calling Agent

This is an auto-generated notebook created by an AI playground export. In this notebook, you will:
- Author a tool-calling agent with MLflow's [`ResponsesAgent`](https://mlflow.org/docs/latest/api_reference/python_api/mlflow.pyfunc.html#mlflow.pyfunc.ResponsesAgent) that uses the OpenAI client
- Manually test the agent's output
- Evaluate the agent with Mosaic AI Agent Evaluation
- Log and deploy the agent

This notebook should be run on serverless compute or on a cluster with a Databricks Runtime (DBR) version below 17.

**_NOTE:_** This notebook uses the OpenAI SDK, but the AI Agent Framework is compatible with any agent authoring framework, including LlamaIndex and LangGraph. To learn more, see the [Authoring Agents](https://docs.databricks.com/generative-ai/agent-framework/author-agent) Databricks documentation.

## Prerequisites

- Address all `TODO`s in this notebook.

In [0]:
%pip install -U -qqqq backoff databricks-openai uv databricks-agents mlflow-skinny[databricks]
dbutils.library.restartPython()

## Define the agent in code
The cell below defines the whole agent in one place, so that the `%%writefile` magic command can write it to a local Python file (`agent.py`) for subsequent logging and deployment.

For more examples of tools to add to your agent, see [docs](https://docs.databricks.com/generative-ai/agent-framework/agent-tool.html).
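
As a rough illustration of how a tool is represented in the agent code below, here is a minimal, standard-library-only sketch. The spec contents, the catalog/schema/function names, and the helpers `tool_name_to_udf_name`, `prepare_spec`, and `fake_lookup_order` are all hypothetical; only the two ideas are taken from the notebook: the optional `strict` flag is dropped from the spec, and a tool name of the form `a__b__c` maps to the three-level Unity Catalog name `a.b.c`.

```python
from typing import Any, Callable

# Hypothetical tool spec in the OpenAI function-calling format.
# The catalog, schema, and function names are made up for illustration.
example_spec: dict[str, Any] = {
    "type": "function",
    "function": {
        "name": "my_catalog__my_schema__lookup_order",
        "description": "Look up an order by its ID.",
        "strict": True,
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}


def tool_name_to_udf_name(tool_name: str) -> str:
    """Convert a tool name such as a__b__c into the three-level name a.b.c."""
    return tool_name.replace("__", ".")


def prepare_spec(spec: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the spec without the optional "strict" flag."""
    function = {k: v for k, v in spec["function"].items() if k != "strict"}
    return {**spec, "function": function}


def fake_lookup_order(order_id: str) -> str:
    """Stand-in for a real tool implementation."""
    return f"order {order_id}: shipped"


# Registry mapping tool name -> callable, similar in role to ToolInfo.exec_fn.
registry: dict[str, Callable[..., Any]] = {
    example_spec["function"]["name"]: fake_lookup_order,
}

clean_spec = prepare_spec(example_spec)
udf_name = tool_name_to_udf_name(clean_spec["function"]["name"])
result = registry[clean_spec["function"]["name"]](order_id="A-17")
print(udf_name)
print(result)
```

The model only ever sees the spec; the registry lookup by name is what connects a tool call emitted by the model to executable code.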

In [0]:
%%writefile agent.py
import json
from typing import Any, Callable, Generator, Optional
from uuid import uuid4
import warnings

import backoff
import mlflow
import openai
from databricks.sdk import WorkspaceClient
from databricks_openai import UCFunctionToolkit, VectorSearchRetrieverTool
from mlflow.entities import SpanType
from mlflow.pyfunc import ResponsesAgent
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
    output_to_responses_items_stream,
    to_chat_completions_input,
)
from openai import OpenAI
from pydantic import BaseModel
from unitycatalog.ai.core.base import get_uc_function_client

############################################
# Define your LLM endpoint and system prompt
############################################
# LLM_ENDPOINT_NAME = "databricks-claude-opus-4-5"
LLM_ENDPOINT_NAME = "databricks-claude-sonnet-4"


SYSTEM_PROMPT = """You are a Fashion Retail Customer Insights Analyst specializing in voice-of-customer analysis for a retail intelligence platform.

## Your Role
Help business users understand customer sentiment, product feedback, and experience patterns by analyzing customer reviews from our fashion retail database.

## Data Context
You have access to customer reviews including:
- **Product Reviews** (60%): Quality, fit, material, sizing, style feedback
- **Purchase Experience** (20%): Shipping, packaging, customer service
- **Return Feedback** (20%): Return reasons and process experience

Reviews include metadata such as:
- Product categories: apparel, footwear, accessories
- Customer segments: VIP, Premium, Loyal, Regular, New
- Brands: Luxe Label, Bold Basics, Eco Threads, Modern Minimal, Vintage Vibes, Street Wear Co, Urban Style, Classic Comfort
- Rating (1-5 stars) and sentiment scores

## Response Guidelines
1. **Always use the review search tool** to find relevant customer feedback before answering questions about products, brands, categories, or customer sentiment.
2. **Synthesize patterns** from multiple reviews rather than quoting individual reviews verbatim.
3. **Structure your response** with:
   - **Key Findings**: 2-3 main insights from the reviews
   - **Themes**: Common positive/negative patterns
   - **Recommendations**: Actionable suggestions based on customer feedback
4. **Be specific** about volume when possible (e.g., "across 15 reviews mentioning sizing...")
5. **Acknowledge limitations** if few reviews match the query.

## Example Queries You Can Answer
- "What are customers saying about footwear sizing?"
- "What quality issues are VIP customers reporting?"
- "How do customers feel about Eco Threads products?"
- "What are the main complaints about accessories?"
- "What do reviews say about leather products?"
"""


###############################################################################
## Define tools for your agent, enabling it to retrieve data or take actions
## beyond text generation
## To create and see usage examples of more tools, see
## https://docs.databricks.com/generative-ai/agent-framework/agent-tool.html
###############################################################################
class ToolInfo(BaseModel):
    """
    Class representing a tool for the agent.
    - "name" (str): The name of the tool.
    - "spec" (dict): JSON description of the tool (matches OpenAI Responses format)
    - "exec_fn" (Callable): Function that implements the tool logic
    """

    name: str
    spec: dict
    exec_fn: Callable


def create_tool_info(tool_spec, exec_fn_param: Optional[Callable] = None):
    tool_spec["function"].pop("strict", None)
    tool_name = tool_spec["function"]["name"]
    udf_name = tool_name.replace("__", ".")

    # Define a wrapper that accepts kwargs for the UC tool call,
    # then passes them to the UC tool execution client
    def exec_fn(**kwargs):
        function_result = uc_function_client.execute_function(udf_name, kwargs)
        if function_result.error is not None:
            return function_result.error
        else:
            return function_result.value
    return ToolInfo(name=tool_name, spec=tool_spec, exec_fn=exec_fn_param or exec_fn)


TOOL_INFOS = []

# You can use UDFs in Unity Catalog as agent tools
# TODO: Add additional tools
UC_TOOL_NAMES = []

uc_toolkit = UCFunctionToolkit(function_names=UC_TOOL_NAMES)
uc_function_client = get_uc_function_client()
for tool_spec in uc_toolkit.tools:
    TOOL_INFOS.append(create_tool_info(tool_spec))


# Use Databricks vector search indexes as tools
# See the [Databricks Documentation](https://docs.databricks.com/generative-ai/agent-framework/unstructured-retrieval-tools.html) for details
VECTOR_SEARCH_TOOLS = []
VECTOR_SEARCH_TOOLS.append(
    VectorSearchRetrieverTool(
        index_name="juan_use1_catalog.retail.gold_customer_reviews_idx",
        tool_description="""Search fashion retail customer reviews using semantic similarity.

Use this tool when users ask about:
- Customer feedback on products, brands, or categories
- Sizing issues, quality concerns, or comfort feedback
- Customer sentiment about specific product types
- Common complaints or praise patterns
- Return reasons or purchase experiences

The index contains ~5,000 reviews with filterable attributes:
- product_category: apparel, footwear, accessories
- product_brand: Luxe Label, Bold Basics, Eco Threads, Modern Minimal, Vintage Vibes, Street Wear Co, Urban Style, Classic Comfort
- customer_segment: vip, premium, loyal, regular, new
- rating: 1-5 star ratings

Search queries should be natural language descriptions of the feedback pattern to find, e.g., "shoes run small", "quality issues with leather", "comfortable for walking"."""
    )
)
for vs_tool in VECTOR_SEARCH_TOOLS:
    TOOL_INFOS.append(create_tool_info(vs_tool.tool, vs_tool.execute))



class ToolCallingAgent(ResponsesAgent):
    """
    Class representing a tool-calling Agent
    """

    def __init__(self, llm_endpoint: str, tools: list[ToolInfo]):
        """Initializes the ToolCallingAgent with tools."""
        self.llm_endpoint = llm_endpoint
        self.workspace_client = WorkspaceClient()
        self.model_serving_client: OpenAI = (
            self.workspace_client.serving_endpoints.get_open_ai_client()
        )
        self._tools_dict = {tool.name: tool for tool in tools}

    def get_tool_specs(self) -> list[dict]:
        """Returns tool specifications in the format OpenAI expects."""
        return [tool_info.spec for tool_info in self._tools_dict.values()]

    @mlflow.trace(span_type=SpanType.TOOL)
    def execute_tool(self, tool_name: str, args: dict) -> Any:
        """Executes the specified tool with the given arguments."""
        return self._tools_dict[tool_name].exec_fn(**args)

    def call_llm(self, messages: list[dict[str, Any]]) -> Generator[dict[str, Any], None, None]:
        with warnings.catch_warnings():
            warnings.filterwarnings("ignore", message="PydanticSerializationUnexpectedValue")
            for chunk in self.model_serving_client.chat.completions.create(
                model=self.llm_endpoint,
                messages=to_chat_completions_input(messages),
                tools=self.get_tool_specs(),
                stream=True,
            ):
                chunk_dict = chunk.to_dict()
                if len(chunk_dict.get("choices", [])) > 0:
                    yield chunk_dict

    def handle_tool_call(
        self,
        tool_call: dict[str, Any],
        messages: list[dict[str, Any]],
    ) -> ResponsesAgentStreamEvent:
        """
        Execute a tool call, append its output to the running message history, and return a ResponsesAgentStreamEvent with the tool output
        """
        args = json.loads(tool_call.get("arguments") or "{}")
        result = str(self.execute_tool(tool_name=tool_call["name"], args=args))

        tool_call_output = self.create_function_call_output_item(tool_call["call_id"], result)
        messages.append(tool_call_output)
        return ResponsesAgentStreamEvent(type="response.output_item.done", item=tool_call_output)

    def call_and_run_tools(
        self,
        messages: list[dict[str, Any]],
        max_iter: int = 10,
    ) -> Generator[ResponsesAgentStreamEvent, None, None]:
        for _ in range(max_iter):
            last_msg = messages[-1]
            if last_msg.get("role", None) == "assistant":
                return
            elif last_msg.get("type", None) == "function_call":
                yield self.handle_tool_call(last_msg, messages)
            else:
                yield from output_to_responses_items_stream(
                    chunks=self.call_llm(messages), aggregator=messages
                )

        yield ResponsesAgentStreamEvent(
            type="response.output_item.done",
            item=self.create_text_output_item("Max iterations reached. Stopping.", str(uuid4())),
        )

    def predict(self, request: ResponsesAgentRequest) -> ResponsesAgentResponse:
        session_id = None
        if request.custom_inputs and "session_id" in request.custom_inputs:
            session_id = request.custom_inputs.get("session_id")
        elif request.context and request.context.conversation_id:
            session_id = request.context.conversation_id

        if session_id:
            mlflow.update_current_trace(
                metadata={
                    "mlflow.trace.session": session_id,
                }
            )

        outputs = [
            event.item
            for event in self.predict_stream(request)
            if event.type == "response.output_item.done"
        ]
        return ResponsesAgentResponse(output=outputs, custom_outputs=request.custom_inputs)

    def predict_stream(self, request: ResponsesAgentRequest) -> Generator[ResponsesAgentStreamEvent, None, None]:
        session_id = None
        if request.custom_inputs and "session_id" in request.custom_inputs:
            session_id = request.custom_inputs.get("session_id")
        elif request.context and request.context.conversation_id:
            session_id = request.context.conversation_id

        if session_id:
            mlflow.update_current_trace(
                metadata={
                    "mlflow.trace.session": session_id,
                }
            )

        messages = to_chat_completions_input([i.model_dump() for i in request.input])
        if SYSTEM_PROMPT:
            messages.insert(0, {"role": "system", "content": SYSTEM_PROMPT})
        yield from self.call_and_run_tools(messages=messages)


# Log the model using MLflow
mlflow.openai.autolog()
AGENT = ToolCallingAgent(llm_endpoint=LLM_ENDPOINT_NAME, tools=TOOL_INFOS)
mlflow.models.set_model(AGENT)

Writing agent.py
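
The control flow of `call_and_run_tools` can be easier to follow without streaming or MLflow types. The following is a simplified sketch under stated assumptions: `fake_llm`, `search_reviews`, and `run_agent_loop` are hypothetical stand-ins, messages are plain dicts, and the LLM returns one complete message per call instead of streamed chunks.

```python
import json
from typing import Any, Callable


def fake_llm(messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Stand-in for the LLM: request one tool call, then answer."""
    has_tool_output = any(m.get("type") == "function_call_output" for m in messages)
    if not has_tool_output:
        return {
            "type": "function_call",
            "call_id": "call_1",
            "name": "search_reviews",
            "arguments": json.dumps({"query": "sizing"}),
        }
    return {"role": "assistant", "content": "Customers report that sizes run small."}


def search_reviews(query: str) -> str:
    """Stand-in for a retrieval tool."""
    return f"3 reviews matched '{query}'"


def run_agent_loop(
    messages: list[dict[str, Any]],
    llm: Callable[[list[dict[str, Any]]], dict[str, Any]],
    tools: dict[str, Callable[..., Any]],
    max_iter: int = 10,
) -> list[dict[str, Any]]:
    """Alternate between LLM calls and tool execution until the LLM answers."""
    for _ in range(max_iter):
        last = messages[-1]
        if last.get("role") == "assistant":
            return messages  # the model produced a final answer
        if last.get("type") == "function_call":
            # Run the requested tool and record its output in the history.
            args = json.loads(last.get("arguments") or "{}")
            output = str(tools[last["name"]](**args))
            messages.append(
                {"type": "function_call_output", "call_id": last["call_id"], "output": output}
            )
        else:
            # User message or tool output: ask the model what to do next.
            messages.append(llm(messages))
    messages.append({"role": "assistant", "content": "Max iterations reached. Stopping."})
    return messages


history = run_agent_loop(
    [{"role": "user", "content": "What do customers say about sizing?"}],
    fake_llm,
    {"search_reviews": search_reviews},
)
capped = run_agent_loop(
    [{"role": "user", "content": "What do customers say about sizing?"}],
    fake_llm,
    {"search_reviews": search_reviews},
    max_iter=1,
)
print([m.get("type") or m.get("role") for m in history])
```

The `max_iter` cap is what keeps a model that keeps requesting tools from looping indefinitely.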


## Test the agent

Interact with the agent to test its output. Since we manually traced methods within `ResponsesAgent`, you can view the trace for each step the agent takes, with any LLM calls made via the OpenAI SDK automatically traced by autologging.

Replace this placeholder input with an appropriate domain-specific example for your agent.
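
The test requests in this section pass a `session_id` in `custom_inputs`, which the agent uses to tag the trace. The lookup order in `predict` can be sketched with plain dicts as follows; `resolve_session_id` is a hypothetical helper, and the real code reads attributes of a `ResponsesAgentRequest` rather than dict keys.

```python
from typing import Any, Optional


def resolve_session_id(request: dict[str, Any]) -> Optional[str]:
    """Prefer custom_inputs.session_id, then fall back to context.conversation_id."""
    custom_inputs = request.get("custom_inputs") or {}
    if "session_id" in custom_inputs:
        return custom_inputs["session_id"]
    context = request.get("context") or {}
    return context.get("conversation_id") or None


explicit = resolve_session_id(
    {
        "input": [{"role": "user", "content": "Hi"}],
        "custom_inputs": {"session_id": "test-sizing-001"},
        "context": {"conversation_id": "conv-42"},
    }
)
fallback = resolve_session_id(
    {"input": [{"role": "user", "content": "Hi"}], "context": {"conversation_id": "conv-42"}}
)
missing = resolve_session_id({"input": [{"role": "user", "content": "Hi"}]})
print(explicit, fallback, missing)
```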

In [0]:
dbutils.library.restartPython()

In [0]:
from agent import AGENT



In [0]:
# Test Case 1: Basic product feedback query
AGENT.predict({
    "input": [{"role": "user", "content": "What are customers saying about footwear sizing?"}],
    "custom_inputs": {"session_id": "test-sizing-001"}
})

[NOTICE] Using a notebook authentication token. Recommended for development only. For improved performance, please use Service Principal based authentication. To disable this message, pass disable_notice=True.


ResponsesAgentResponse(tool_choice=None, truncation=None, id=None, created_at=None, error=None, incomplete_details=None, instructions=None, metadata=None, model=None, object='response', output=[OutputItem(type='message', id='msg_bdrk_01D1uLKME5QnEvb1NjxJd2QP', content=[{'text': "I'll search for customer feedback about footwear sizing to understand the common patterns and concerns.", 'type': 'output_text'}], role='assistant'), OutputItem(type='function_call', id='msg_bdrk_01D1uLKME5QnEvb1NjxJd2QP', call_id='toolu_bdrk_01J5XHwxaDvboWVzYB1gZn9m', name='juan_use1_catalog__retail__gold_customer_reviews_idx', arguments='{"query": "footwear sizing fit shoes boots sandals size runs small large true to size"}'), OutputItem(type='function_call_output', call_id='toolu_bdrk_01J5XHwxaDvboWVzYB1gZn9m', output='[{\'page_content\': "This was my first purchase here and The Urban Style Sandals 1967 in navy was okay. My feet don\'t hurt after hours of walking. However, runs at least a full size small. Ru

Trace(trace_id=tr-3c96eb1878d221c747215ea3a7b749b0)

In [0]:
# Test Case 2: Brand-specific query
AGENT.predict({
    "input": [{"role": "user", "content": "What do customers think about Eco Threads products?"}],
    "custom_inputs": {"session_id": "test-brand-001"}
})

[NOTICE] Using a notebook authentication token. Recommended for development only. For improved performance, please use Service Principal based authentication. To disable this message, pass disable_notice=True.


ResponsesAgentResponse(tool_choice=None, truncation=None, id=None, created_at=None, error=None, incomplete_details=None, instructions=None, metadata=None, model=None, object='response', output=[OutputItem(type='message', id='msg_bdrk_01SLyTU49BsFKa2oBgHFo8Zq', content=[{'text': "I'll search for customer feedback about Eco Threads products to understand what customers are saying about this brand.", 'type': 'output_text'}], role='assistant'), OutputItem(type='function_call', id='msg_bdrk_01SLyTU49BsFKa2oBgHFo8Zq', call_id='toolu_bdrk_01GKfgwLCNYyU941UYrmRk7i', name='juan_use1_catalog__retail__gold_customer_reviews_idx', arguments='{"query": "Eco Threads products customer feedback reviews opinions"}'), OutputItem(type='function_call_output', call_id='toolu_bdrk_01GKfgwLCNYyU941UYrmRk7i', output="[{'page_content': 'As a regular customer, The Eco Threads Handbags 1083 in brown exceeded expectations. Elevates any outfit. The silk feels absolutely luxurious. Classic Comfort lives up to its na

Trace(trace_id=tr-e625df99d418766cd3dcd09a126ccced)

In [0]:
# Test Case 3: Quality issues query
AGENT.predict({
    "input": [{"role": "user", "content": "What are the main quality complaints for leather products?"}],
    "custom_inputs": {"session_id": "test-quality-001"}
})

# Test Case 4: Customer segment query
AGENT.predict({
    "input": [{"role": "user", "content": "What are VIP customers complaining about?"}],
    "custom_inputs": {"session_id": "test-vip-001"}
})

# Test Case 5: Return/experience query
AGENT.predict({
    "input": [{"role": "user", "content": "What are the main reasons customers return items?"}],
    "custom_inputs": {"session_id": "test-returns-001"}
})

In [0]:
for chunk in AGENT.predict_stream(
    {"input": [{"role": "user", "content": "What are the main reasons customers return items?"}], "custom_inputs": {"session_id": "test-session-123"}}
):
    print(chunk.model_dump(exclude_none=True))
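
The non-streaming `predict` method is built on top of `predict_stream`: it keeps only the events of type `response.output_item.done` and returns their items. A minimal sketch of that filtering with plain dicts, where `fake_stream` and `collect_done_items` are hypothetical names and the event contents are invented for illustration:

```python
from typing import Any, Iterable, Iterator


def fake_stream() -> Iterator[dict[str, Any]]:
    """Stand-in for predict_stream: two text deltas, then one completed item."""
    yield {"type": "response.output_text.delta", "delta": "Sizes "}
    yield {"type": "response.output_text.delta", "delta": "run small."}
    yield {
        "type": "response.output_item.done",
        "item": {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Sizes run small."}],
        },
    }


def collect_done_items(events: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only completed output items and drop incremental deltas."""
    return [e["item"] for e in events if e["type"] == "response.output_item.done"]


items = collect_done_items(fake_stream())
print(len(items), items[0]["content"][0]["text"])
```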

### Log the `agent` as an MLflow model
Determine the Databricks resources to specify for automatic auth passthrough at deployment time.
- **TODO**: If your Unity Catalog Function queries a [vector search index](https://docs.databricks.com/generative-ai/agent-framework/unstructured-retrieval-tools.html) or leverages [external functions](https://docs.databricks.com/generative-ai/agent-framework/external-connection-tools.html), you need to include the dependent vector search index and UC connection objects, respectively, as resources. See [docs](https://docs.databricks.com/generative-ai/agent-framework/log-agent.html#specify-resources-for-automatic-authentication-passthrough) for more details.

Log the agent as code from the `agent.py` file. See [MLflow - Models from Code](https://mlflow.org/docs/latest/models.html#models-from-code).
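
The logging cell pins `databricks-connect` to the version installed in the notebook environment, so that the serving environment resolves the same version. One way to build such a pin with the standard library is sketched below; `pin_requirement` is a hypothetical helper, and falling back to an unpinned requirement when the package is missing is an assumption made for this sketch.

```python
from importlib.metadata import PackageNotFoundError, version


def pin_requirement(package: str) -> str:
    """Return 'package==<installed version>', or the bare name if not installed."""
    try:
        return f"{package}=={version(package)}"
    except PackageNotFoundError:
        return package


# A package name that is assumed not to be installed anywhere.
unpinned = pin_requirement("surely-not-installed-package-xyz-123")
print(unpinned)
print(pin_requirement("databricks-connect"))
```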

In [0]:
# Determine Databricks resources to specify for automatic auth passthrough at deployment time
import mlflow
from agent import LLM_ENDPOINT_NAME, VECTOR_SEARCH_TOOLS, uc_toolkit
from mlflow.models.resources import DatabricksFunction, DatabricksServingEndpoint
from importlib.metadata import version as get_version

resources = [DatabricksServingEndpoint(endpoint_name=LLM_ENDPOINT_NAME)]
for tool in VECTOR_SEARCH_TOOLS:
    resources.extend(tool.resources)
for tool in uc_toolkit.tools:
    # TODO: If the UC function includes dependencies like external connection or vector search, please include them manually.
    # See the TODO in the markdown above for more information.
    udf_name = tool.get("function", {}).get("name", "").replace("__", ".")
    resources.append(DatabricksFunction(function_name=udf_name))

input_example = {
    "input": [
        {
            "role": "user",
            "content": "What are the main sizing complaints for footwear products?"
        }
    ],
    "custom_inputs": {
        "session_id": "test-session"
    }
}

with mlflow.start_run():
    logged_agent_info = mlflow.pyfunc.log_model(
        name="agent",
        python_model="agent.py",
        input_example=input_example,
        pip_requirements=[
            "databricks-openai",
            "backoff",
            # Pin databricks-connect to the version installed in this environment
            f"databricks-connect=={get_version('databricks-connect')}",
        ],
        resources=resources,
    )

🔗 View Logged Model at: https://e2-demo-field-eng.cloud.databricks.com/ml/experiments/1879320557133611/models/m-a35b21b6000e434eb715999378a53a5c?o=1444828305810485
2026/01/25 13:56:11 INFO mlflow.pyfunc: Predicting on input example to validate output


[NOTICE] Using a notebook authentication token. Recommended for development only. For improved performance, please use Service Principal based authentication. To disable this message, pass disable_notice=True.


## Evaluate the agent with [Agent Evaluation](https://docs.databricks.com/mlflow3/genai/eval-monitor)

You can edit the requests or expected responses in your evaluation dataset and rerun the evaluation as you iterate on your agent, using MLflow to track the computed quality metrics.

Evaluate your agent with one of our [predefined LLM scorers](https://docs.databricks.com/mlflow3/genai/eval-monitor/predefined-judge-scorers), or try adding [custom metrics](https://docs.databricks.com/mlflow3/genai/eval-monitor/custom-scorers).
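
In the evaluation cell below, each record's `inputs` dict has a single key, `input`, and the `predict_fn` lambda takes a parameter with the same name. To our understanding, `mlflow.genai.evaluate` passes the `inputs` dict to `predict_fn` as keyword arguments, which is why the two names have to match. The sketch below imitates that behavior with the standard library only; `call_predict_fn`, `fake_predict`, and `mismatched_predict` are hypothetical and are not part of MLflow.

```python
from typing import Any, Callable, Optional


def call_predict_fn(predict_fn: Callable[..., Any], record: dict[str, Any]) -> Any:
    """Imitate the assumed behavior: unpack record['inputs'] as keyword arguments."""
    return predict_fn(**record["inputs"])


def fake_predict(input: list[dict[str, str]]) -> str:
    """Stand-in agent; the parameter is named `input` on purpose to match the key."""
    return f"echo: {input[-1]['content']}"


def mismatched_predict(messages: list[dict[str, str]]) -> str:
    """Parameter name does not match the `input` key, so the call should fail."""
    return "unused"


record = {
    "inputs": {"input": [{"role": "user", "content": "What do customers say about shipping speed?"}]},
    "expected_response": "Should discuss delivery/shipping related feedback",
}

answer = call_predict_fn(fake_predict, record)

mismatch_error: Optional[str] = None
try:
    call_predict_fn(mismatched_predict, record)
except TypeError as exc:
    mismatch_error = str(exc)

print(answer)
print(mismatch_error)
```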

In [0]:
import mlflow
from mlflow.genai.scorers import RelevanceToQuery, Safety, RetrievalRelevance, RetrievalGroundedness

In [0]:
eval_dataset = [
    # Category-specific queries
    {
        "inputs": {"input": [{"role": "user", "content": "What are customers saying about footwear comfort?"}]},
        "expected_response": "Should mention comfort-related feedback from footwear reviews"
    },
    {
        "inputs": {"input": [{"role": "user", "content": "What sizing issues do customers report for apparel?"}]},
        "expected_response": "Should discuss sizing runs small/large patterns"
    },
    # Brand-specific queries
    {
        "inputs": {"input": [{"role": "user", "content": "How do customers rate Luxe Label quality?"}]},
        "expected_response": "Should reference luxury brand sentiment and quality feedback"
    },
    {
        "inputs": {"input": [{"role": "user", "content": "What do eco-conscious customers say about Eco Threads?"}]},
        "expected_response": "Should mention sustainability and eco-friendly feedback"
    },
    # Material-specific queries
    {
        "inputs": {"input": [{"role": "user", "content": "What do reviews say about leather product quality?"}]},
        "expected_response": "Should include leather-specific feedback patterns"
    },
    {
        "inputs": {"input": [{"role": "user", "content": "Are there complaints about cotton shrinkage?"}]},
        "expected_response": "Should reference cotton fabric feedback"
    },
    # Customer segment queries
    {
        "inputs": {"input": [{"role": "user", "content": "What are VIP customers unhappy about?"}]},
        "expected_response": "Should filter for VIP segment negative feedback"
    },
    {
        "inputs": {"input": [{"role": "user", "content": "What feedback do new customers give about their first purchase?"}]},
        "expected_response": "Should reference new customer segment feedback"
    },
    # Experience queries
    {
        "inputs": {"input": [{"role": "user", "content": "What do customers say about shipping speed?"}]},
        "expected_response": "Should discuss delivery/shipping related feedback"
    },
    {
        "inputs": {"input": [{"role": "user", "content": "What are the main reasons for product returns?"}]},
        "expected_response": "Should synthesize return feedback reasons"
    },
    # Complex multi-facet queries
    {
        "inputs": {"input": [{"role": "user", "content": "Compare customer satisfaction between budget and premium price tiers"}]},
        "expected_response": "Should contrast feedback across price segments"
    },
    {
        "inputs": {"input": [{"role": "user", "content": "What quality issues affect customer recommendations?"}]},
        "expected_response": "Should correlate quality issues with recommendation patterns"
    }
]

eval_results = mlflow.genai.evaluate(
    data=eval_dataset,
    predict_fn=lambda input: AGENT.predict({"input": input, "custom_inputs": {"session_id": "evaluation-session"}}),
    scorers=[
        RelevanceToQuery(),
        Safety(),
        RetrievalRelevance(),  # checks that the retrieved reviews are relevant to the query
        RetrievalGroundedness(),  # checks that the response is grounded in the retrieved reviews
    ],
)

2026/01/25 14:00:10 INFO mlflow.models.evaluation.utils.trace: Auto tracing is temporarily enabled during the model evaluation for computing some metrics and debugging. To disable tracing, call `mlflow.autolog(disable=True)`.
2026/01/25 14:00:10 INFO mlflow.genai.utils.data_validation: Testing model prediction with the first sample in the dataset. To disable this check, set the MLFLOW_GENAI_EVAL_SKIP_TRACE_VALIDATION environment variable to True.


[NOTICE] Using a notebook authentication token. Recommended for development only. For improved performance, please use Service Principal based authentication. To disable this message, pass disable_notice=True.


Evaluating:   0%|          | 0/12 [Elapsed: 00:00, Remaining: ?] 



## Perform pre-deployment validation of the agent
Before registering and deploying the agent, we perform pre-deployment checks via the [mlflow.models.predict()](https://mlflow.org/docs/latest/python_api/mlflow.models.html#mlflow.models.predict) API. See the [documentation](https://docs.databricks.com/machine-learning/model-serving/model-serving-debug.html#validate-inputs) for details.

In [0]:
mlflow.models.predict(
    model_uri=f"runs:/{logged_agent_info.run_id}/agent",
    input_data={"input": [{"role": "user", "content": "Hello!"}], "custom_inputs": {"session_id": "validation-session"}},
    env_manager="uv",
)

Downloading artifacts:   0%|          | 0/1 [00:00<?, ?it/s]

Downloading artifacts:   0%|          | 0/12 [00:00<?, ?it/s]

2026/01/25 14:04:06 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'


2026/01/25 14:04:08 INFO mlflow.utils.virtualenv: Creating a new environment in /tmp/virtualenv_envs/mlflow-638c8c4b28ffe026a41a13bf83433cd9d04818ef with python version 3.12.3 using uv
Using CPython 3.12.3 interpreter at: /usr/bin/python3.12
Creating virtual environment at: /tmp/virtualenv_envs/mlflow-638c8c4b28ffe026a41a13bf83433cd9d04818ef
Activate with: source /tmp/virtualenv_envs/mlflow-638c8c4b28ffe026a41a13bf83433cd9d04818ef/bin/activate
2026/01/25 14:04:09 INFO mlflow.utils.virtualenv: Installing dependencies
Using Python 3.12.3 environment at: /tmp/virtualenv_envs/mlflow-638c8c4b28ffe026a41a13bf83433cd9d04818ef
Resolved 3 packages in 122ms
Downloading pip (1.8MiB)
Downloading setuptools (1.2MiB)
 Downloaded pip
 Downloaded setuptools
Prepared 3 packages in 145ms
Installed 3 packages in 20ms

{"object": "response", "output": [{"type": "message", "id": "msg_bdrk_01LUjpGrwF9osjrzyge5nvGE", "content": [{"text": "Hello! I'm your Fashion Retail Customer Insights Analyst. I'm here to help you understand what customers are saying about products, brands, and their shopping experiences through our customer review database.\n\nI can help you analyze customer feedback on topics like:\n\n- **Product Performance**: Sizing issues, quality concerns, comfort, style feedback\n- **Brand Sentiment**: How customers feel about specific brands like Luxe Label, Eco Threads, Street Wear Co, etc.\n- **Category Insights**: Patterns in apparel, footwear, or accessories reviews\n- **Customer Experience**: Purchase, shipping, and return experiences\n- **Segment Analysis**: Feedback differences between VIP, Premium, Loyal, Regular, and New customers\n\nWhat would you like to explore today? For example, you could ask:\n- \"What are customers saying about sizing issues?\"\n- \"How do VIP customers rate Lu

2026/01/25 14:04:34 INFO mlflow.tracing.export.async_export_queue: Flushing the async trace logging queue before program exit. This may take a while...


## Register the model to Unity Catalog

Update the `catalog`, `schema`, and `model_name` below to register the MLflow model to Unity Catalog.
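
Unity Catalog models are addressed by a three-level name of the form `catalog.schema.model_name`. A small sketch of assembling such a name with a basic sanity check follows; `build_uc_model_name` is a hypothetical helper, and the rule it enforces (no empty parts, no dots inside a part) is a simplification assumed for this example, not the full Unity Catalog naming rules.

```python
def build_uc_model_name(catalog: str, schema: str, model_name: str) -> str:
    """Join the three parts into catalog.schema.model_name after a basic check."""
    parts = (catalog, schema, model_name)
    for part in parts:
        # A dot inside a part would silently change the number of levels.
        if not part or "." in part:
            raise ValueError(f"Invalid name part: {part!r}")
    return ".".join(parts)


full_name = build_uc_model_name("my_catalog", "my_schema", "my_agent")
print(full_name)
```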

In [0]:
mlflow.set_registry_uri("databricks-uc")

# TODO: define the catalog, schema, and model name for your UC model
catalog = "juan_use1_catalog"
schema = "retail"
model_name = "retail_customer_reviews_agent"
UC_MODEL_NAME = f"{catalog}.{schema}.{model_name}"

# register the model to UC
uc_registered_model_info = mlflow.register_model(
    model_uri=logged_agent_info.model_uri, name=UC_MODEL_NAME
)

Successfully registered model 'juan_use1_catalog.retail.retail_customer_reviews_agent'.


Downloading artifacts:   0%|          | 0/12 [00:00<?, ?it/s]

Uploading artifacts:   0%|          | 0/13 [00:00<?, ?it/s]

🔗 Created version '1' of model 'juan_use1_catalog.retail.retail_customer_reviews_agent': https://e2-demo-field-eng.cloud.databricks.com/explore/data/models/juan_use1_catalog/retail/retail_customer_reviews_agent/version/1?o=1444828305810485


## Deploy the agent

In [0]:
from databricks import agents
# NOTE: pass scale_to_zero=True to agents.deploy() to enable scale-to-zero for cost savings.
# This is not recommended for production workloads, as capacity is not guaranteed when scaled to zero.
# Scaled to zero endpoints may take extra time to respond when queried, while they scale back up.
agents.deploy(UC_MODEL_NAME, uc_registered_model_info.version, tags={"endpointSource": "playground"})

Downloading artifacts:   0%|          | 0/1 [00:00<?, ?it/s]

For more information, see: https://docs.databricks.com/aws/en/generative-ai/agent-framework/feedback-model



    Deployment of juan_use1_catalog.retail.retail_customer_reviews_agent version 1 initiated.  This can take up to 15 minutes and the Review App & Query Endpoint will not work until this deployment finishes.

    View status: https://e2-demo-field-eng.cloud.databricks.com/ml/endpoints/agents_juan_use1_catalog-retail-retail_customer_reviews_agent/?o=1444828305810485
    Review App: https://e2-demo-field-eng.cloud.databricks.com/ml/review-v2/489fc50325ec4532b0ad59df7523a114/chat?o=1444828305810485

You can refer back to the links above from the endpoint detail page at https://e2-demo-field-eng.cloud.databricks.com/ml/endpoints/agents_juan_use1_catalog-retail-retail_customer_reviews_agent/?o=1444828305810485.

To set up monitoring for your deployed agent, see:
https://docs.databricks.com/aws/en/mlflow3/genai/eval-monitor/production-monitoring


Deployment(model_name='juan_use1_catalog.retail.retail_customer_reviews_agent', model_version='1', endpoint_name='agents_juan_use1_catalog-retail-retail_customer_reviews_agent', served_entity_name='juan_use1_catalog-retail-retail_customer_reviews_agent_1', query_endpoint='https://e2-demo-field-eng.cloud.databricks.com/serving-endpoints/agents_juan_use1_catalog-retail-retail_customer_reviews_agent/served-models/juan_use1_catalog-retail-retail_customer_reviews_agent_1/invocations?o=1444828305810485', endpoint_url='https://e2-demo-field-eng.cloud.databricks.com/ml/endpoints/agents_juan_use1_catalog-retail-retail_customer_reviews_agent/?o=1444828305810485', review_app_url='https://e2-demo-field-eng.cloud.databricks.com/ml/review-v2/489fc50325ec4532b0ad59df7523a114/chat?o=1444828305810485')

## Next steps

After your agent is deployed, you can chat with it in AI Playground to perform additional checks, share it with SMEs in your organization for feedback, or embed it in a production application. See the [docs](https://docs.databricks.com/generative-ai/deploy-agent.html) for details.
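
For embedding the agent in an application, a request to the serving endpoint could be assembled roughly as follows. This is a sketch under stated assumptions: the workspace host, endpoint name, and token are placeholders, `build_invocation_request` is a hypothetical helper, the payload is assumed to have the same shape as the dicts passed to `AGENT.predict` earlier in this notebook, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_invocation_request(
    host: str, endpoint_name: str, token: str, question: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a serving endpoint."""
    url = f"{host.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations"
    payload = {"input": [{"role": "user", "content": question}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace them with your workspace URL, endpoint name, and token.
request = build_invocation_request(
    host="https://example-workspace.cloud.databricks.com/",
    endpoint_name="my-agent-endpoint",
    token="dummy-token",
    question="What are customers saying about footwear sizing?",
)
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen(request)`) requires a reachable endpoint and a valid token, so it is left out of the sketch.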