# Async Agentic Workflow with Databricks Lakeflow Jobs

This notebook demonstrates how to use Databricks Lakeflow Jobs to execute async agentic workflows.

**Key Demo Points:**
- Interactive agent for research planning (Planner Agent)
- Async job execution via Lakeflow (Researcher Agent)
- Non-blocking polling for job status
- Results saved to Unity Catalog Volume

## Flow
1. Converse with Planner Agent to create a research plan
2. Approve the plan → triggers async Lakeflow Job
3. Continue working while job runs (poll for updates as needed)
4. Retrieve completed report from UC Volume

## Setup

In [None]:
%pip install databricks-sdk databricks-mcp openai pydantic --quiet
dbutils.library.restartPython()

In [None]:
import sys
import os

# Add src to path for imports
# Update this path to match your workspace location
SRC_PATH = "/Workspace/Users/{your_email}/agent-job-run/src"
sys.path.insert(0, SRC_PATH)

# Verify imports work
from models.research_plan import ResearchPlan
from planner_agent import PlannerAgent
from job_tools import check_job_status

print("Imports successful!")

## Configuration

Update these values for your environment:

In [None]:
# ============================================
# CONFIGURATION - UPDATE THESE VALUES
# ============================================

# LLM endpoint (Databricks Foundation Model)
LLM_ENDPOINT = "databricks-claude-sonnet-4"

# Path to the researcher notebook in your workspace
RESEARCHER_NOTEBOOK_PATH = "/Workspace/Users/{your_email}/agent-job-run/src/notebooks/02_researcher_job"

# UC Volume path for output reports
# Format: /Volumes/{catalog}/{schema}/{volume}
OUTPUT_VOLUME_PATH = "/Volumes/{catalog}/{schema}/{volume}"

print(f"LLM Endpoint: {LLM_ENDPOINT}")
print(f"Researcher Notebook: {RESEARCHER_NOTEBOOK_PATH}")
print(f"Output Volume: {OUTPUT_VOLUME_PATH}")
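
Because the paths above ship with `{...}` placeholders, a quick sanity check before initializing the agent can catch values that were not updated. The following is a minimal sketch, not part of the project's `src` package: `find_config_problems` is a hypothetical helper, the catalog, schema, and volume names in the example are made up, and the strict three-segment volume check simply follows the format comment in the cell above.

```python
import re

# Matches /Volumes/<catalog>/<schema>/<volume> with no leftover braces.
VOLUME_PATTERN = re.compile(r"^/Volumes/[^/{}]+/[^/{}]+/[^/{}]+/?$")


def find_config_problems(settings: dict) -> list:
    """Return human-readable problems found in the configuration values."""
    problems = []
    for name, value in settings.items():
        # A "{something}" fragment means the template value was never replaced.
        if re.search(r"\{[^}]*\}", value):
            problems.append(f"{name} still contains a placeholder: {value}")
    volume = settings.get("OUTPUT_VOLUME_PATH", "")
    # Only check the shape once the placeholders are gone, to avoid double reporting.
    if "{" not in volume and not VOLUME_PATTERN.match(volume):
        problems.append(
            f"OUTPUT_VOLUME_PATH is not of the form /Volumes/<catalog>/<schema>/<volume>: {volume}"
        )
    return problems


# Example values (hypothetical): the notebook path was not updated, the volume path was.
example = {
    "RESEARCHER_NOTEBOOK_PATH": "/Workspace/Users/{your_email}/agent-job-run/src/notebooks/02_researcher_job",
    "OUTPUT_VOLUME_PATH": "/Volumes/main/research/reports",
}
problems = find_config_problems(example)
for problem in problems:
    print(problem)
```

In the notebook, the same function could be called with the real `RESEARCHER_NOTEBOOK_PATH` and `OUTPUT_VOLUME_PATH` values, raising an error if the returned list is not empty.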

## Initialize Planner Agent

In [None]:
from databricks.sdk import WorkspaceClient

ws = WorkspaceClient()
print(f"Connected to: {ws.config.host}")

agent = PlannerAgent(
    llm_endpoint=LLM_ENDPOINT,
    researcher_notebook_path=RESEARCHER_NOTEBOOK_PATH,
    output_volume_path=OUTPUT_VOLUME_PATH,
    workspace_client=ws,
)

print("Planner Agent initialized!")

## Interactive Conversation

Use the `agent.chat()` method to converse with the Planner Agent.

**Example conversation flow:**
1. "I want to research the impact of AI on healthcare"
2. Agent proposes research questions
3. "Looks good, but add a question about regulatory challenges"
4. Agent updates questions
5. "Perfect, let's run it"
6. Agent submits the async job and returns its `run_id`
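
Step 6 is where the approved plan is handed to Lakeflow. How `PlannerAgent` does this internally is not shown in this notebook, so the following is only a sketch of what a one-time run request could look like, using the field names of the Jobs API `runs/submit` request body (`run_name`, `timeout_seconds`, `tasks`, `notebook_task`, `base_parameters`). The function name `build_submit_payload`, the task key, and the parameter names `topic`, `questions`, and `output_volume_path` are assumptions, and compute settings are omitted.

```python
import json
from typing import List


def build_submit_payload(
    topic: str,
    questions: List[str],
    notebook_path: str,
    output_volume_path: str,
    timeout_seconds: int = 3600,
) -> dict:
    """Build a request body for a one-time notebook run (illustrative only)."""
    return {
        "run_name": f"research: {topic}",
        "timeout_seconds": timeout_seconds,
        "tasks": [
            {
                "task_key": "researcher",  # hypothetical task key
                "notebook_task": {
                    "notebook_path": notebook_path,
                    # Notebook parameters are strings, so the question list is JSON-encoded.
                    "base_parameters": {
                        "topic": topic,
                        "questions": json.dumps(questions),
                        "output_volume_path": output_volume_path,
                    },
                },
            }
        ],
    }


payload = build_submit_payload(
    topic="generative AI adoption in enterprise",
    questions=["Which industries lead adoption?", "What are the main blockers?"],
    notebook_path="/Workspace/Users/{your_email}/agent-job-run/src/notebooks/02_researcher_job",
    output_volume_path="/Volumes/{catalog}/{schema}/{volume}",
)
print(json.dumps(payload, indent=2))
```

Submitting a payload like this returns a `run_id` right away rather than the finished report, which is what lets the conversation continue while the research runs.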

In [None]:
# Start the conversation - describe your research topic
response = agent.chat("I want to research trends in generative AI adoption in enterprise")
print(response)

In [None]:
# Continue the conversation: approve the plan (or request changes first)
response = agent.chat("Those questions look good. Let's run the research.")
print(response)

## Monitor Job Status

The research job is now running asynchronously, and the Planner Agent does not wait for it to complete. In the meantime you can:
- Continue with other work in this notebook
- Poll for job status as needed

In [None]:
# Check status of active jobs
active_jobs = agent.get_active_jobs()
for job in active_jobs:
    print(f"Run ID: {job['run_id']}")
    print(f"  Topic: {job['topic']}")
    print(f"  State: {job['state']}")
    print(f"  Output: {job['output_path']}")
    print()
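
To make "non-blocking" concrete: a status check looks up each run once and returns, with no sleep and no loop until completion. The sketch below illustrates that pattern with an injected `get_state` callable, so it runs without a workspace. `poll_once` and the fake run IDs are hypothetical. With the Databricks SDK, `get_state` could wrap `ws.jobs.get_run(run_id)` and read `run.state.life_cycle_state`, but that wiring is an assumption and is not exercised here.

```python
from typing import Callable, Dict, Iterable

# Life cycle states after which a run will not change any further.
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def poll_once(run_ids: Iterable[int], get_state: Callable[[int], str]) -> Dict[int, dict]:
    """Look up each run exactly once and return immediately."""
    snapshot = {}
    for run_id in run_ids:
        state = get_state(run_id)
        snapshot[run_id] = {"state": state, "done": state in TERMINAL_STATES}
    return snapshot


# Stand-in for a real status lookup (hypothetical run IDs and states).
fake_states = {101: "RUNNING", 102: "TERMINATED"}

snapshot = poll_once([101, 102], fake_states.__getitem__)
for run_id, info in snapshot.items():
    print(run_id, info["state"], "done" if info["done"] else "still running")
```

Calling `poll_once` again later gives a fresh snapshot; nothing in the notebook is blocked between calls.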

In [None]:
# Ask the agent about job status
# To ask about a specific run, include the run_id returned at submission in your message
response = agent.chat("What's the status of the research job?")
print(response)

## Retrieve Results

Once the job completes, retrieve the research report:

In [None]:
# Ask the agent to retrieve the report
# This only works after the job has completed
response = agent.chat("Show me the research report")
print(response)

In [None]:
# Alternative: Read directly from UC Volume
# Replace with actual output path
# output_path = "/Volumes/{catalog}/{schema}/{volume}/report_xxx.md"
# with open(output_path, 'r') as f:
#     report = f.read()
# print(report)
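
If the exact report filename is not known, one option is to pick the most recently modified report in the output directory. This is a minimal sketch that assumes reports follow the `report_*.md` naming suggested by the commented path above and that the volume is readable through ordinary file APIs, as in that cell. `latest_report` is a hypothetical helper, and the demo uses a temporary directory in place of a real volume.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def latest_report(volume_dir: str, pattern: str = "report_*.md") -> Optional[Path]:
    """Return the most recently modified file matching the pattern, or None."""
    candidates = list(Path(volume_dir).glob(pattern))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Demo against a temporary directory standing in for the UC Volume path.
with tempfile.TemporaryDirectory() as tmp:
    older = Path(tmp) / "report_a.md"
    newer = Path(tmp) / "report_b.md"
    older.write_text("# old")
    newer.write_text("# new")
    # Set modification times explicitly so the ordering is deterministic.
    os.utime(older, (1_000, 1_000))
    os.utime(newer, (2_000, 2_000))

    picked = latest_report(tmp)
    picked_name = picked.name
    picked_text = picked.read_text()
    empty_result = latest_report(tmp, pattern="missing_*.md")

print(picked_name)
```

In the notebook, `latest_report(OUTPUT_VOLUME_PATH)` would return `None` until the job has written its first report, which doubles as a simple completion check.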

## Key Takeaways

This demo shows:

1. **Async Execution**: The Planner Agent kicks off a Lakeflow Job and returns immediately, without waiting for the job to finish.

2. **Long-Running Tasks**: The Researcher Agent can run for more than 5 minutes (up to the job timeout) without blocking the notebook or the conversation.

3. **Non-Blocking Polling**: Check job status anytime without waiting for completion.

4. **UC Volume Output**: Results are persisted to Unity Catalog Volume for reliable retrieval.

5. **MCP Tool Integration**: The Researcher Agent uses Databricks MCP for web search capabilities.