# Unity Catalog Data Advisor Agent - Test Notebook

This notebook tests the Unity Catalog Data Advisor Agent with Peloton data queries.

## Setup
1. Upload all project files to your Databricks workspace
2. Set `LLM_ENDPOINT_NAME` and `VECTOR_SEARCH_INDEX_NAME` in `config.py`
3. Run the cells below to test the agent
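
Step 2 can be checked before any agent code runs. The sketch below is one way to do that, assuming `config.py` exposes both names as module-level strings. `missing_settings` is a hypothetical helper, not part of the project, and the `SimpleNamespace` stands in for `import config` with placeholder values.

```python
from types import SimpleNamespace

# The two settings named in the setup steps above.
REQUIRED_SETTINGS = ("LLM_ENDPOINT_NAME", "VECTOR_SEARCH_INDEX_NAME")


def missing_settings(config, required=REQUIRED_SETTINGS):
    """Return the required setting names that are absent or blank on `config`."""
    missing = []
    for name in required:
        value = getattr(config, name, None)
        if not isinstance(value, str) or not value.strip():
            missing.append(name)
    return missing


# Stand-in for `import config`; these values are placeholders, not real resource names.
example_config = SimpleNamespace(
    LLM_ENDPOINT_NAME="my-llm-endpoint",
    VECTOR_SEARCH_INDEX_NAME="",
)

print(missing_settings(example_config))  # ['VECTOR_SEARCH_INDEX_NAME']
```

In the notebook itself, the same function could be called as `missing_settings(config)` after `import config`, if the module is laid out as assumed.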


In [None]:
%pip install -U -qqqq backoff databricks-openai uv databricks-agents mlflow-skinny[databricks]
dbutils.library.restartPython()


In [None]:
# Enable MLflow OpenAI autologging before importing the agent
import mlflow
mlflow.openai.autolog()

# Import the agent
from agent import AGENT
from mlflow.types.responses import ResponsesAgentRequest

print("Agent loaded successfully!")
print(f"LLM Endpoint: {AGENT.llm_endpoint}")
print(f"Available tools: {list(AGENT._tools_dict.keys())}")


## Test 1: Peloton Data Discovery Query


In [None]:
# Test query about Peloton data
request = ResponsesAgentRequest(
    input=[{
        "role": "user", 
        "content": "What Peloton data do we have available? I need to analyze Peloton customer behavior and usage patterns."
    }],
    custom_inputs={"session_id": "peloton-test-session"}
)

print("Query: What Peloton data do we have available?")
print("=" * 80)
print()

response = AGENT.predict(request)

# Print the response in a readable format
for output in response.output:
    if hasattr(output, 'content'):
        print(output.content)
    elif isinstance(output, dict) and 'content' in output:
        print(output['content'])
    else:
        print(output)


## Test 2: Streaming Response
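
The streaming cell keeps only `response.output_item.done` events and prints the finished item. The sketch below shows that filtering on hand-built dictionaries, so it runs without an endpoint. It assumes the agent emits Responses-style items: `message` items whose `content` is a list of `output_text` parts, and `function_call_output` items that keep the tool result under `output`. `render_item` and `sample_events` are illustrative names, not part of the agent.

```python
def render_item(item):
    """Return printable text for one finished output item, or None to skip it."""
    item_type = item.get("type")
    if item_type == "message":
        # Join the text parts of the assistant message.
        parts = item.get("content", [])
        return "".join(p.get("text", "") for p in parts if p.get("type") == "output_text")
    if item_type == "function_call_output":
        return f"[Tool Output]: {item.get('output', '')}"
    return None


# Hand-built events standing in for chunk.model_dump(exclude_none=True).
sample_events = [
    {"type": "response.output_text.delta", "delta": "partial"},
    {
        "type": "response.output_item.done",
        "item": {"type": "function_call_output", "call_id": "call_1", "output": "3 tables found"},
    },
    {
        "type": "response.output_item.done",
        "item": {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Here are the tables."}],
        },
    },
]

# Keep only completed items, as the streaming loop does.
rendered = [
    render_item(event["item"])
    for event in sample_events
    if event.get("type") == "response.output_item.done"
]
print(rendered)  # ['[Tool Output]: 3 tables found', 'Here are the tables.']
```

If the agent emits a different item shape, the field names in `render_item` would need to change accordingly.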


In [None]:
# Test streaming response
request = ResponsesAgentRequest(
    input=[{
        "role": "user", 
        "content": "What tables contain Peloton sales or transaction data?"
    }],
    custom_inputs={"session_id": "streaming-test-session"}
)

print("Query: What tables contain Peloton sales or transaction data?")
print("=" * 80)
print()

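for chunk in AGENT.predict_stream(request):
    chunk_data = chunk.model_dump(exclude_none=True)
    if chunk_data.get("type") == "response.output_item.done":
        item = chunk_data.get("item", {})
        if item.get("type") == "message":
            # Message items carry a list of content parts; print the text parts
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    print(part.get("text", ""), end="", flush=True)
        elif item.get("type") == "function_call_output":
            # Tool results are stored under "output" in the Responses schema
            print(f"\n[Tool Output]: {item.get('output', '')}")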

## Test 3: Custom Query

Modify the query below to test your own data discovery questions.
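
To try several questions in one run, the request arguments can be generated per query. This is a minimal sketch that mirrors the request shape used in this notebook. `build_request_kwargs` and the numbered session IDs are illustrative choices, not part of the agent.

```python
def build_request_kwargs(query, session_id):
    """Build the keyword arguments this notebook passes to ResponsesAgentRequest."""
    return {
        "input": [{"role": "user", "content": query}],
        "custom_inputs": {"session_id": session_id},
    }


queries = [
    "What datasets are available for analyzing customer engagement?",
    "What tables contain Peloton sales or transaction data?",
]

# One request per query, each with its own session ID.
batch = [
    build_request_kwargs(query, f"custom-query-session-{i}")
    for i, query in enumerate(queries, start=1)
]

for kwargs in batch:
    print(kwargs["custom_inputs"]["session_id"], "->", kwargs["input"][0]["content"])
```

Each dictionary can then be expanded into a request with `ResponsesAgentRequest(**kwargs)` and passed to `AGENT.predict`.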

In [None]:
# Custom query - modify as needed
custom_query = "What datasets are available for analyzing customer engagement?"

request = ResponsesAgentRequest(
    input=[{"role": "user", "content": custom_query}],
    custom_inputs={"session_id": "custom-query-session"}
)

print(f"Query: {custom_query}")
print("=" * 80)
print()

response = AGENT.predict(request)

for output in response.output:
    if hasattr(output, 'content'):
        print(output.content)
    elif isinstance(output, dict) and 'content' in output:
        print(output['content'])
    else:
        print(output)


## Troubleshooting

If you encounter errors:

1. **Check configuration**: Verify `LLM_ENDPOINT_NAME` and `VECTOR_SEARCH_INDEX_NAME` in `config.py`
2. **Check permissions**: Ensure you have access to:
   - The model serving endpoint
   - The vector search index
   - Unity Catalog functions (`system.ai.python_exec`)
3. **Check MLflow traces**: View traces in the MLflow UI to see tool calls and LLM interactions
