# Multi-Agent System with Genie + LLM Summarization

This notebook creates a multi-agent system where:
1. **Genie Agent** provides structured data (tables, statistics)
2. **Supervisor Agent** (Llama 3.1) creates natural language summaries
3. **Output includes BOTH** the table and the summary

## Prerequisites
- Genie Space created and configured
- Databricks serving endpoint access


In [None]:
%pip install -U -qqq langgraph-supervisor==0.0.30 mlflow[databricks] databricks-langchain databricks-agents uv 
dbutils.library.restartPython()


## Define the Enhanced Multi-Agent System

Key enhancement: The supervisor prompt explicitly instructs the LLM to:
- Preserve structured data from Genie
- Add a natural language summary
- Provide insights and key findings
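
To make these instructions concrete, the following is a minimal sketch of how sub-agent descriptions can be folded into a supervisor prompt that asks for both a summary and the raw data. `SubAgent` and `build_supervisor_prompt` are hypothetical names used only for illustration; the actual prompt lives in `agent.py` in the next cell and is considerably more detailed.

```python
from typing import NamedTuple


class SubAgent(NamedTuple):
    """Hypothetical stand-in for the agent config models used in agent.py."""
    name: str
    description: str


def build_supervisor_prompt(agents: list[SubAgent]) -> str:
    """Assemble a prompt that lists the agents and asks for summary + data."""
    agent_lines = "\n".join(f"- {a.name}: {a.description}" for a in agents)
    return (
        "You are a supervisor in a multi-agent system.\n"
        "When an agent returns data, provide BOTH:\n"
        "a) a 2-4 sentence natural language summary with specific numbers\n"
        "b) the structured data (tables) exactly as returned\n\n"
        f"Available agents:\n{agent_lines}\n"
    )


prompt = build_supervisor_prompt(
    [SubAgent("talent_genie", "Analyzes attrition and workforce trends.")]
)
print(prompt)
```

Keeping the agent list in a separate variable means new sub-agents can be added without rewriting the summarization instructions.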


In [None]:
%%writefile agent.py
import json
from typing import Generator, Literal
from uuid import uuid4

import mlflow
from databricks_langchain import (
    ChatDatabricks,
    DatabricksFunctionClient,
    UCFunctionToolkit,
    set_uc_function_client,
)
from databricks_langchain.genie import GenieAgent
from langchain_core.runnables import Runnable
from langchain.agents import create_agent
from langgraph.graph.state import CompiledStateGraph
from langgraph_supervisor import create_supervisor
from mlflow.pyfunc import ResponsesAgent
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
    output_to_responses_items_stream,
    to_chat_completions_input,
)
from pydantic import BaseModel

client = DatabricksFunctionClient()
set_uc_function_client(client)

########################################
# Agent Configuration Models
########################################

GENIE = "genie"


class ServedSubAgent(BaseModel):
    endpoint_name: str
    name: str
    task: Literal["agent/v1/responses", "agent/v1/chat", "agent/v2/chat"]
    description: str


class Genie(BaseModel):
    space_id: str
    name: str
    task: str = GENIE
    description: str


class InCodeSubAgent(BaseModel):
    tools: list[str]
    name: str
    description: str


TOOLS = []


def stringify_content(state):
    """Serialize list-valued content of the last message to a JSON string."""
    msgs = state["messages"]
    if isinstance(msgs[-1].content, list):
        msgs[-1].content = json.dumps(msgs[-1].content, indent=4)
    return {"messages": msgs}


########################################
# Create LangGraph Supervisor with Enhanced Summarization
########################################


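def create_langgraph_supervisor(
    llm: Runnable,
    externally_served_agents: list[ServedSubAgent | Genie] | None = None,
    in_code_agents: list[InCodeSubAgent] | None = None,
):
    # Default to None rather than [] so lists are never shared between calls
    externally_served_agents = externally_served_agents or []
    in_code_agents = in_code_agents or []
    agents = []
    agent_descriptions = ""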

    # Process inline code agents
    for agent in in_code_agents:
        agent_descriptions += f"- {agent.name}: {agent.description}\n"
        uc_toolkit = UCFunctionToolkit(function_names=agent.tools)
        TOOLS.extend(uc_toolkit.tools)
        agents.append(create_agent(llm, tools=uc_toolkit.tools, name=agent.name))

    # Process served endpoints and Genie Spaces
    for agent in externally_served_agents:
        agent_descriptions += f"- {agent.name}: {agent.description}\n"
        if isinstance(agent, Genie):
            genie_agent = GenieAgent(
                genie_space_id=agent.space_id,
                genie_agent_name=agent.name,
                description=agent.description,
            )
            genie_agent.name = agent.name
            agents.append(genie_agent)
        else:
            # Disable streaming for subagents for ease of parsing
            model = ChatDatabricks(
                endpoint=agent.endpoint_name,
                use_responses_api="responses" in agent.task,
                disable_streaming=True,
            )
            agents.append(
                create_agent(
                    model,
                    tools=[],
                    name=agent.name,
                    post_model_hook=stringify_content,
                )
            )

    # ENHANCED SUPERVISOR PROMPT with explicit summarization instructions
    prompt = f"""
You are a supervisor in a multi-agent system specialized in workforce analytics, talent mobility, and attrition analysis.

Your workflow:
1. **Understand the user's request** - What insights are they seeking?
2. **Check chat history** - Has this been answered already?
3. **Delegate if needed** - Route to the appropriate agent if information is missing
4. **Analyze & Summarize** - When an agent returns data:
   
   **CRITICAL: You MUST provide BOTH:**
   
   a) **Natural Language Summary** (2-4 sentences):
      - State the key finding that directly answers the user's question
      - Highlight the most significant insight or trend
      - Note any notable patterns, outliers, or anomalies
      - Use specific numbers and percentages from the data
      - Make it actionable and business-relevant
   
   b) **Preserve the structured data** - Keep tables and detailed results intact
   
5. **Response Format:**
   ```
   [Your natural language summary here - clear, insightful, specific]
   
   [Detailed data/table follows]
   ```

**Example Good Response:**
"The analysis reveals that Sales has the highest attrition rate at 15.2%, nearly double the company average of 8.1%. This is primarily driven by high turnover in the first year (23% of new Sales hires leave within 12 months). Engineering shows the strongest retention at 6.3%, suggesting successful retention programs in technical roles."

[Table with detailed breakdown by department]

Available agents:
{agent_descriptions}

Remember: Always provide actionable insights with specific data points, not just descriptions of what the data shows.
"""

    return create_supervisor(
        agents=agents,
        model=llm,
        prompt=prompt,
        add_handoff_messages=False,
        output_mode="full_history",  # Keep full history to get both summary and table
    ).compile()


##########################################
# Wrap LangGraph Supervisor as a ResponsesAgent
##########################################


class LangGraphResponsesAgent(ResponsesAgent):
    def __init__(self, agent: CompiledStateGraph):
        self.agent = agent

    def predict(self, request: ResponsesAgentRequest) -> ResponsesAgentResponse:
        outputs = [
            event.item
            for event in self.predict_stream(request)
            if event.type == "response.output_item.done"
        ]
        return ResponsesAgentResponse(output=outputs, custom_outputs=request.custom_inputs)

    def predict_stream(
        self,
        request: ResponsesAgentRequest,
    ) -> Generator[ResponsesAgentStreamEvent, None, None]:
        cc_msgs = to_chat_completions_input([i.model_dump() for i in request.input])
        first_message = True
        seen_ids = set()

        for _, events in self.agent.stream({"messages": cc_msgs}, stream_mode=["updates"]):
            new_msgs = [
                msg
                for v in events.values()
                for msg in v.get("messages", [])
                if msg.id not in seen_ids
            ]
            if first_message:
                seen_ids.update(msg.id for msg in new_msgs[: len(cc_msgs)])
                new_msgs = new_msgs[len(cc_msgs) :]
                first_message = False
            else:
                seen_ids.update(msg.id for msg in new_msgs)
                node_name = tuple(events.keys())[0]
                yield ResponsesAgentStreamEvent(
                    type="response.output_item.done",
                    item=self.create_text_output_item(
                        text=f"<name>{node_name}</name>", id=str(uuid4())
                    ),
                )
            if len(new_msgs) > 0:
                yield from output_to_responses_items_stream(new_msgs)


#######################################################
# Configure Foundation Model and Sub-Agents
#######################################################

# Foundation model for supervisor (will generate summaries)
LLM_ENDPOINT_NAME = "databricks-meta-llama-3-1-8b-instruct"
llm = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME)

# Configure your Genie Space
EXTERNALLY_SERVED_AGENTS = [
    Genie(
        space_id="01f0c9f705201d14b364f5daf28bb639",  # TODO: Update with your Genie Space ID
        name="talent_genie",
        description="Analyzes talent stability, mobility patterns, attrition risk, and workforce trends. Provides structured data including statistics, tables, and detailed breakdowns by department, role, tenure, and other dimensions."
    ),
]

# Optional: Add UC function-calling agents
IN_CODE_AGENTS = []

#################################################
# Create Supervisor and Set Up MLflow
#################################################

supervisor = create_langgraph_supervisor(llm, EXTERNALLY_SERVED_AGENTS, IN_CODE_AGENTS)

mlflow.langchain.autolog()
AGENT = LangGraphResponsesAgent(supervisor)
mlflow.models.set_model(AGENT)


## Test the Agent

Test the agent locally before deploying. You should see:
1. **Summary** from the supervisor (natural language insights)
2. **Table** from Genie (structured data)
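
One way to check that a response contains both parts is to separate table rows from prose. The sketch below assumes the response text has already been extracted as a plain string and that Genie's table arrives as a markdown pipe table; both are assumptions, and `split_summary_and_table` is a hypothetical helper, not part of the notebook.

```python
def split_summary_and_table(text: str) -> tuple[str, str]:
    """Separate markdown table rows (lines starting with '|') from prose."""
    table_lines = []
    prose_lines = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("|"):
            table_lines.append(stripped)
        elif stripped:
            prose_lines.append(stripped)
    return " ".join(prose_lines), "\n".join(table_lines)


# Hypothetical response text, used only to exercise the helper
sample = (
    "Sales has the highest attrition rate at 15.2%.\n\n"
    "| Department | Attrition Rate |\n"
    "|------------|----------------|\n"
    "| Sales      | 15.2%          |\n"
)
summary, table = split_summary_and_table(sample)
print("Has summary:", bool(summary))
print("Has table:", bool(table))
```

If either value comes back empty, the supervisor prompt or the Genie Space configuration is the first place to look.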


In [None]:
dbutils.library.restartPython()


In [None]:
from agent import AGENT

# Test with a question that will require Genie to query data
input_example = {
    "input": [
        {"role": "user", "content": "Which department has the highest attrition rate?"}
    ]
}

# Get the response
response = AGENT.predict(input_example)
print(response)


In [None]:
# Test streaming to see the flow
print("=" * 80)
print("STREAMING OUTPUT (shows agent handoffs and responses)")
print("=" * 80)

for event in AGENT.predict_stream(input_example):
    output = event.model_dump(exclude_none=True)
    
    # Extract and display content
    if 'item' in output and 'content' in output['item']:
        for content_item in output['item']['content']:
            if 'text' in content_item:
                text = content_item['text']
                
                # Highlight agent names
                if text.startswith('<name>'):
                    print(f"\n{'='*60}")
                    print(f"➜ Agent: {text}")
                    print(f"{'='*60}\n")
                else:
                    print(text)


## Log the Agent to MLflow

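Log the agent to MLflow and declare every Databricks resource it depends on (serving endpoint, Genie Space, SQL warehouse, tables) so that authentication to those resources is handled automatically at serving time.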
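
Unity Catalog objects are addressed with a three-level `catalog.schema.object` name, so it can help to sanity-check the table names before declaring them as resources. The helper below is a hypothetical pre-flight check, not part of MLflow or the notebook; whether a partially qualified name is rejected at logging time or only fails later is not something this sketch verifies.

```python
def find_unqualified_names(names: list[str]) -> list[str]:
    """Return the names that are not of the form catalog.schema.object."""
    unqualified = []
    for name in names:
        parts = name.split(".")
        # A fully qualified name has exactly three non-empty parts
        if len(parts) != 3 or not all(part.strip() for part in parts):
            unqualified.append(name)
    return unqualified


table_names = [
    "akash_s_demo.talent.dim_employees",  # name used in this notebook
    "talent.fact_performance",            # hypothetical mistake: catalog missing
]
print(find_unqualified_names(table_names))
```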


In [None]:
import mlflow
from agent import EXTERNALLY_SERVED_AGENTS, LLM_ENDPOINT_NAME, TOOLS, Genie
from databricks_langchain import UnityCatalogTool, VectorSearchRetrieverTool
from mlflow.models.resources import (
    DatabricksFunction,
    DatabricksGenieSpace,
    DatabricksServingEndpoint,
    DatabricksSQLWarehouse,
    DatabricksTable
)
from pkg_resources import get_distribution

# Configure resources for automatic authentication
resources = [DatabricksServingEndpoint(endpoint_name=LLM_ENDPOINT_NAME)]

# Add SQL Warehouse and tables for Genie Space
# TODO: Update these with your actual warehouse and table names
resources.append(DatabricksSQLWarehouse(warehouse_id="148ccb90800933a1"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.fact_attrition_snapshots"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.dim_employees"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.fact_compensation"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.fact_performance"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.fact_role_history"))

# Add UC function tools if any
for tool in TOOLS:
    if isinstance(tool, VectorSearchRetrieverTool):
        resources.extend(tool.resources)
    elif isinstance(tool, UnityCatalogTool):
        resources.append(DatabricksFunction(function_name=tool.uc_function_name))

# Add Genie Space
for agent in EXTERNALLY_SERVED_AGENTS:
    if isinstance(agent, Genie):
        resources.append(DatabricksGenieSpace(genie_space_id=agent.space_id))
    else:
        resources.append(DatabricksServingEndpoint(endpoint_name=agent.endpoint_name))

# Log the model
with mlflow.start_run():
    logged_agent_info = mlflow.pyfunc.log_model(
        name="agent",
        python_model="agent.py",
        resources=resources,
        pip_requirements=[
            f"databricks-connect=={get_distribution('databricks-connect').version}",
            f"mlflow=={get_distribution('mlflow').version}",
            f"databricks-langchain=={get_distribution('databricks-langchain').version}",
            f"langgraph=={get_distribution('langgraph').version}",
            f"langgraph-supervisor=={get_distribution('langgraph-supervisor').version}",
        ],
    )

print("✅ Model logged successfully!")
print(f"Run ID: {logged_agent_info.run_id}")
print(f"Model URI: {logged_agent_info.model_uri}")


## Register to Unity Catalog


In [None]:
mlflow.set_registry_uri("databricks-uc")

# TODO: Update these with your catalog, schema, and model name
catalog = "akash_s_demo"
schema = "talent"
model_name = "mobility_attrition_with_summary"
UC_MODEL_NAME = f"{catalog}.{schema}.{model_name}"

# Register the model
uc_registered_model_info = mlflow.register_model(
    model_uri=logged_agent_info.model_uri, name=UC_MODEL_NAME
)

print("✅ Model registered to Unity Catalog!")
print(f"Model: {UC_MODEL_NAME}")
print(f"Version: {uc_registered_model_info.version}")


## Deploy the Agent

Deploy the agent to a serving endpoint.


In [None]:
from databricks import agents

# Deploy the agent
deployment_info = agents.deploy(
    UC_MODEL_NAME, 
    uc_registered_model_info.version,
    tags={"enhanced": "with_summary"},
    deploy_feedback_model=False
)

print("\n" + "="*80)
print("🚀 DEPLOYMENT INITIATED")
print("="*80)
print("\nYour agent with enhanced summarization is being deployed!")
print("\n📊 What to expect:")
print("  • Natural language summaries from Llama 3.1")
print("  • Structured tables from Genie")
print("  • Both in a single response")
print("\nThis deployment can take up to 15 minutes.")
print("\n" + "="*80)


## Example Output

When you ask: **"Which department has the highest attrition rate?"**

The response should resemble the following (the values shown are illustrative):

### From Supervisor (Summary):
```
The Sales department shows the highest attrition rate at 15.2%, significantly above 
the company average of 8.1%. This elevated turnover is concentrated among employees 
with less than 2 years tenure, suggesting onboarding or early-career challenges. 
The Engineering department maintains the strongest retention at 6.3%, indicating 
effective retention strategies in technical roles.
```

### From Genie (Table):
```
| Department  | Attrition Rate | Employee Count | Avg Tenure |
|-------------|----------------|----------------|------------|
| Sales       | 15.2%          | 450            | 2.3 years  |
| Support     | 12.8%          | 320            | 2.8 years  |
| Marketing   | 10.5%          | 180            | 3.2 years  |
| Operations  | 9.2%           | 280            | 3.8 years  |
| Engineering | 6.3%           | 520            | 4.5 years  |
```
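
Because the summary is generated by an LLM, it is worth checking its headline number against the table. The sketch below parses the illustrative table above and finds the department with the highest attrition rate; `parse_markdown_table` is a hypothetical helper and assumes a well-formed pipe table with a header row and a separator row.

```python
def parse_markdown_table(table: str) -> list[dict[str, str]]:
    """Parse a markdown pipe table into a list of row dictionaries."""
    lines = [ln.strip() for ln in table.strip().splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for line in lines[2:]:  # skip the header and the separator row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


table = """
| Department  | Attrition Rate | Employee Count | Avg Tenure |
|-------------|----------------|----------------|------------|
| Sales       | 15.2%          | 450            | 2.3 years  |
| Support     | 12.8%          | 320            | 2.8 years  |
| Marketing   | 10.5%          | 180            | 3.2 years  |
| Operations  | 9.2%           | 280            | 3.8 years  |
| Engineering | 6.3%           | 520            | 4.5 years  |
"""

rows = parse_markdown_table(table)
top = max(rows, key=lambda r: float(r["Attrition Rate"].rstrip("%")))
print(top["Department"], top["Attrition Rate"])  # Sales 15.2%
```

A mismatch between this result and the figure quoted in the summary would indicate that the supervisor misread the data.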

## Key Improvements

✅ **Natural Language Insights** - Easy to understand summaries  
✅ **Structured Data** - Detailed tables for analysis  
✅ **Actionable** - Specific numbers and trends  
✅ **Context-Aware** - Supervisor analyzes the data intelligently
