# Multi-Agent System with Genie + LLM Summarization

This notebook creates a multi-agent system where:
1. **Genie Agent** provides structured data (tables, statistics)
2. **Supervisor Agent** (Llama 3.1) creates natural language summaries
3. **Output includes BOTH** the table and the summary

## Prerequisites
- Genie Space created and configured
- Databricks serving endpoint access


In [None]:
%pip install -U -qqq langgraph-supervisor==0.0.30 mlflow[databricks] databricks-langchain databricks-agents uv 
dbutils.library.restartPython()


## Define the Multi-Agent System with Explicit Graph Structure

### Graph Architecture (Nodes and Edges):

```
┌────────────────┐
│ User Question  │
└───────┬────────┘
        │
        ▼
┌────────────────┐
│ Genie Node     │  ← Queries data, returns table
└───────┬────────┘
        │ (Forced Edge)
        ▼
┌────────────────┐
│ Supervisor Node│  ← Creates 2-line summary + preserves table
└───────┬────────┘
        │
        ▼
┌────────────────┐
│ User Response  │  ← Returns summary + table
└────────────────┘
```
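The linear flow in the diagram can be mimicked in plain Python to see how the message state accumulates. This is a minimal sketch, not LangGraph itself: `run_linear_graph`, `fake_genie`, and `fake_supervisor` are hypothetical stand-ins, and the only behavior borrowed from the notebook is that each node returns just its new messages, which are appended to the state with `operator.add`.

```python
import operator
from typing import Callable

# Each node takes the current state and returns only the messages it adds.
Node = Callable[[dict], dict]


def fake_genie(state: dict) -> dict:
    # Hypothetical stand-in for the Genie node: returns a small markdown table.
    table = "| Department | Rate |\n|---|---|\n| Sales | 15.2% |"
    return {"messages": [{"role": "assistant", "name": "genie", "content": table}]}


def fake_supervisor(state: dict) -> dict:
    # Hypothetical stand-in for the Supervisor node: reads the last message
    # (Genie's output) and prepends a placeholder line instead of calling an LLM.
    genie_content = state["messages"][-1]["content"]
    summary = "Summary placeholder.\n\n" + genie_content
    return {"messages": [{"role": "assistant", "name": "supervisor", "content": summary}]}


def run_linear_graph(nodes: list[Node], state: dict) -> dict:
    # Fixed order, no routing: every node runs once, and its output is merged
    # into the state with operator.add (list concatenation).
    for node in nodes:
        update = node(state)
        state = {"messages": operator.add(state["messages"], update["messages"])}
    return state


initial = {"messages": [{"role": "user", "content": "Which department has the highest attrition rate?"}]}
final = run_linear_graph([fake_genie, fake_supervisor], initial)
print([m.get("name", m["role"]) for m in final["messages"]])
# → ['user', 'genie', 'supervisor']
```

Because the reducer concatenates lists, a node that returned the full history instead of only its new messages would duplicate earlier entries in the state.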

### Key Features:
- ✅ **Explicit routing**: Genie output ALWAYS goes to Supervisor
- ✅ **No conditional logic**: Simple linear flow
- ✅ **Enforced format**: Supervisor is prompted for a 2-line summary + table, and the original table is appended if the LLM omits it
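The table guarantee does not rest on the prompt alone: the supervisor node in `agent.py` appends Genie's original output whenever the LLM reply contains no `|` character. The sketch below isolates that check so it can be tried on its own. `ensure_table` is a hypothetical name, and treating any `|` as evidence of a table is a heuristic rather than real table detection.

```python
def ensure_table(llm_text: str, genie_text: str) -> str:
    """Append Genie's output when the LLM reply appears to have dropped the table."""
    # Heuristic: a markdown table always contains pipe characters.
    if "|" not in llm_text and "|" in genie_text:
        return f"{llm_text}\n\n{genie_text}"
    return llm_text


genie_text = "| Department | Rate |\n|---|---|\n| Sales | 15.2% |"

# Case 1: the LLM returned only prose, so the table is appended.
print(ensure_table("Sales has the highest attrition.", genie_text))

# Case 2: the LLM already included a table, so the reply is left unchanged.
reply_with_table = "Sales leads.\n\n" + genie_text
print(ensure_table(reply_with_table, genie_text) == reply_with_table)
```

A reply that contains a stray `|` in prose would pass this check without a table, so a stricter test may be worth adding if that case shows up in practice.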


In [None]:
%%writefile agent.py
import json
from typing import Generator, Literal
from uuid import uuid4

import mlflow
from databricks_langchain import (
    ChatDatabricks,
    DatabricksFunctionClient,
    UCFunctionToolkit,
    set_uc_function_client,
)
from databricks_langchain.genie import GenieAgent
from langchain_core.runnables import Runnable
from langchain.agents import create_agent
from langgraph.graph.state import CompiledStateGraph
from langgraph_supervisor import create_supervisor
from mlflow.pyfunc import ResponsesAgent
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
    output_to_responses_items_stream,
    to_chat_completions_input,
)
from pydantic import BaseModel

client = DatabricksFunctionClient()
set_uc_function_client(client)

########################################
# Agent Configuration Models
########################################

GENIE = "genie"


class ServedSubAgent(BaseModel):
    endpoint_name: str
    name: str
    task: Literal["agent/v1/responses", "agent/v1/chat", "agent/v2/chat"]
    description: str


class Genie(BaseModel):
    space_id: str
    name: str
    task: str = GENIE
    description: str


class InCodeSubAgent(BaseModel):
    tools: list[str]
    name: str
    description: str


TOOLS = []


def stringify_content(state):
    """Convert content to string format for processing"""
    msgs = state["messages"]
    if isinstance(msgs[-1].content, list):
        msgs[-1].content = json.dumps(msgs[-1].content, indent=4)
    return {"messages": msgs}


########################################
# Create Custom LangGraph with Explicit Nodes and Edges
########################################

from langgraph.graph import StateGraph, END
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from typing import TypedDict, Annotated
import operator


# Define the state structure
class AgentState(TypedDict):
    messages: Annotated[list, operator.add]


def create_langgraph_with_nodes(
    llm: Runnable,
    externally_served_agents: list[ServedSubAgent | Genie] | None = None,
):
    """
    Create a LangGraph with explicit nodes and edges:
    - User Question → Genie Node (gets data)
    - Genie Response → Supervisor Node (summarizes)
    - Supervisor Response → END (returns to user)
    """
    
    # Create Genie agent
    genie_agent = None
    # The default is None rather than a shared mutable list; treat None as "no agents"
    for agent in externally_served_agents or []:
        if isinstance(agent, Genie):
            genie_agent = GenieAgent(
                genie_space_id=agent.space_id,
                genie_agent_name=agent.name,
                description=agent.description,
            )
            genie_agent.name = agent.name
            break
    
    if not genie_agent:
        raise ValueError("Genie agent is required")
    
    # Define Genie node function
    def genie_node(state: AgentState):
        """Genie node - queries data and returns structured results"""
        messages = state["messages"]
        
        print(f"DEBUG Genie - Input messages: {len(messages)}")
        
        # Invoke Genie
        response = genie_agent.invoke({"messages": messages})
        
        print(f"DEBUG Genie - Response type: {type(response)}")
        print(f"DEBUG Genie - Response keys: {response.keys() if isinstance(response, dict) else 'Not a dict'}")
        
        # GenieAgent returns the response differently - check for 'output' or last message
        genie_output = None
        
        # Try to get the actual Genie response
        if isinstance(response, dict):
            # Check if there's an 'output' field (common in agent responses)
            if 'output' in response:
                genie_output = response['output']
                print(f"DEBUG Genie - Found 'output' field: {str(genie_output)[:200]}")
            # Check if messages were appended
            elif 'messages' in response and len(response['messages']) > len(messages):
                new_msgs = response['messages'][len(messages):]
                genie_output = new_msgs[-1] if new_msgs else None
                print(f"DEBUG Genie - Found new messages: {len(new_msgs)}")
            # Otherwise, get the last message which should have Genie's response
            elif 'messages' in response and response['messages']:
                last_msg = response['messages'][-1]
                genie_output = last_msg
                print(f"DEBUG Genie - Using last message: {type(last_msg)}")
        
        # Convert to AIMessage if needed
        if genie_output:
            if isinstance(genie_output, str):
                genie_message = AIMessage(content=genie_output, name="genie")
            elif hasattr(genie_output, 'content'):
                genie_message = AIMessage(content=genie_output.content, name="genie")
            elif isinstance(genie_output, dict) and 'content' in genie_output:
                genie_message = AIMessage(content=genie_output['content'], name="genie")
            else:
                genie_message = AIMessage(content=str(genie_output), name="genie")
            
            print(f"DEBUG Genie - Created AIMessage with content length: {len(genie_message.content)}")
            return {"messages": [genie_message]}
        else:
            print("ERROR Genie - No output found!")
            return {"messages": [AIMessage(content="Genie returned no data.", name="genie")]}
    
    # Define Supervisor node function
    def supervisor_node(state: AgentState):
        """Supervisor node - summarizes Genie's data into 2-line summary + table"""
        messages = state["messages"]
        
        # Get ALL messages - find the one from Genie (should be AI message after user message)
        genie_response = ""
        
        # Look for the last AI message (from Genie)
        for msg in reversed(messages):
            if hasattr(msg, 'content') and msg.content:
                # Check if it's an AI message and has actual content
                if isinstance(msg, AIMessage) or (hasattr(msg, 'type') and msg.type == 'ai'):
                    content = str(msg.content)
                    # Skip if it's too short or empty
                    if content and len(content.strip()) > 10:
                        genie_response = content
                        break
        
        # Debug: Print what we got from Genie
        print(f"DEBUG - Messages count: {len(messages)}")
        print(f"DEBUG - Genie response length: {len(genie_response) if genie_response else 0}")
        if genie_response:
            print(f"DEBUG - Genie response preview: {genie_response[:200]}...")
        
        if not genie_response or len(genie_response.strip()) < 10:
            # If still no response, get the full state for debugging
            error_msg = f"No data received from Genie. Messages in state: {len(messages)}"
            print(f"ERROR: {error_msg}")
            for i, msg in enumerate(messages):
                print(f"  Message {i}: type={type(msg).__name__}, has_content={hasattr(msg, 'content')}")
                if hasattr(msg, 'content'):
                    content_preview = str(msg.content)[:100]
                    print(f"    Content preview: {content_preview}")
            return {"messages": [AIMessage(content=error_msg)]}
        
        # Create supervisor prompt - be VERY explicit
        system_prompt = """You are a data analyst. Your job is to write a 2-line summary and include the original table.

OUTPUT FORMAT (copy exactly):
[Line 1: Key finding with number]
[Line 2: Second insight]

[PASTE THE ORIGINAL TABLE HERE]

EXAMPLE:
Sales has highest attrition at 15.2%, above the 8.1% average.
Engineering shows best retention at 6.3% with effective programs.

| Department | Rate  | Count |
|------------|-------|-------|
| Sales      | 15.2% | 450   |
| Engineering| 6.3%  | 520   |

RULES:
- Write EXACTLY 2 short lines analyzing the data
- Add blank line
- Copy the COMPLETE original table unchanged
- That's it - nothing else"""
        
        # Create messages for LLM with explicit instruction
        user_prompt = f"""Here is the data with a table:

{genie_response}

Instructions:
1. Write 2 lines summarizing the key findings
2. Include the complete table from above

Your response:"""
        
        supervisor_messages = [
            SystemMessage(content=system_prompt),
            HumanMessage(content=user_prompt)
        ]
        
        # Get summary from LLM
        summary_response = llm.invoke(supervisor_messages)
        
        # Combine summary with original table to ensure table is preserved
        final_response = summary_response.content
        
        # If the table isn't in the response, append it
        if '|' not in final_response and '|' in genie_response:
            print("DEBUG - Table not in LLM response, appending original table")
            final_response = f"{final_response}\n\n{genie_response}"
        
        print(f"DEBUG Supervisor - Final response length: {len(final_response)}")
        print(f"DEBUG Supervisor - Response preview: {final_response[:300]}")
        
        # Return as a single message with the summary content
        return {"messages": [AIMessage(content=final_response, name="supervisor")]}
    
    # Build the graph with explicit nodes and edges
    workflow = StateGraph(AgentState)
    
    # Add nodes
    workflow.add_node("genie", genie_node)
    workflow.add_node("supervisor", supervisor_node)
    
    # Define edges: User → Genie → Supervisor → END
    workflow.set_entry_point("genie")
    workflow.add_edge("genie", "supervisor")  # Genie ALWAYS goes to Supervisor
    workflow.add_edge("supervisor", END)       # Supervisor ALWAYS returns to user
    
    return workflow.compile()


##########################################
# Wrap LangGraph Supervisor as a ResponsesAgent
##########################################


class LangGraphResponsesAgent(ResponsesAgent):
    def __init__(self, agent: CompiledStateGraph):
        self.agent = agent

    def predict(self, request: ResponsesAgentRequest) -> ResponsesAgentResponse:
        outputs = [
            event.item
            for event in self.predict_stream(request)
            if event.type == "response.output_item.done"
        ]
        return ResponsesAgentResponse(output=outputs, custom_outputs=request.custom_inputs)

    def predict_stream(
        self,
        request: ResponsesAgentRequest,
    ) -> Generator[ResponsesAgentStreamEvent, None, None]:
        cc_msgs = to_chat_completions_input([i.model_dump() for i in request.input])
        # Messages built inside the nodes carry no id (id=None), so only
        # deduplicate on real ids; otherwise the supervisor's message would be
        # dropped as a "duplicate" of Genie's.
        seen_ids = set()

        for _, events in self.agent.stream({"messages": cc_msgs}, stream_mode=["updates"]):
            node_name = tuple(events.keys())[0] if events else "unknown"
            
            # Get messages from this node
            new_msgs = []
            for v in events.values():
                for msg in v.get("messages", []):
                    msg_id = getattr(msg, "id", None)
                    if msg_id is not None:
                        if msg_id in seen_ids:
                            continue
                        seen_ids.add(msg_id)
                    new_msgs.append(msg)
            
            # Emit node name tag
            if new_msgs:
                yield ResponsesAgentStreamEvent(
                    type="response.output_item.done",
                    item=self.create_text_output_item(
                        text=f"<name>{node_name}</name>", id=str(uuid4())
                    ),
                )
                
                # Emit the actual messages
                yield from output_to_responses_items_stream(new_msgs)


#######################################################
# Configure Foundation Model and Sub-Agents
#######################################################

# Foundation model for supervisor (will generate summaries)
LLM_ENDPOINT_NAME = "databricks-meta-llama-3-1-8b-instruct"
llm = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME)

# Configure your Genie Space
EXTERNALLY_SERVED_AGENTS = [
    Genie(
        space_id="01f0c9f705201d14b364f5daf28bb639",  # TODO: Update with your Genie Space ID
        name="talent_genie",
        description="Analyzes talent stability, mobility patterns, attrition risk, and workforce trends. Provides structured data including statistics, tables, and detailed breakdowns by department, role, tenure, and other dimensions."
    ),
]

# Optional: Add UC function-calling agents
IN_CODE_AGENTS = []

#################################################
# Create Graph with Explicit Nodes and Edges
#################################################

# Create the graph: User → Genie → Supervisor → END
supervisor = create_langgraph_with_nodes(llm, EXTERNALLY_SERVED_AGENTS)

print("✓ Graph created with explicit flow:")
print("  User Question → Genie Node → Supervisor Node → User Response")

mlflow.langchain.autolog()
AGENT = LangGraphResponsesAgent(supervisor)
mlflow.models.set_model(AGENT)


## Visualize Graph Structure

Display the node and edge structure of the LangGraph.
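When PNG rendering is unavailable, the same structure can be written out as Mermaid flowchart text and pasted into any Mermaid renderer. The sketch below builds that text from a hard-coded edge list. `edges_to_mermaid` and `EDGES` are hypothetical names for illustration, and the edge list is copied by hand from the diagram above rather than read from the compiled graph.

```python
def edges_to_mermaid(edges: list[tuple[str, str]]) -> str:
    """Render a list of (source, target) edges as Mermaid flowchart text."""
    lines = ["graph TD"]
    for src, dst in edges:
        lines.append(f"    {src} --> {dst}")
    return "\n".join(lines)


# Edges of the linear flow, written out manually (an assumption of this sketch).
EDGES = [("START", "genie"), ("genie", "supervisor"), ("supervisor", "END")]

print(edges_to_mermaid(EDGES))
```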


In [None]:
from agent import supervisor
from IPython.display import Image, display

try:
    # Try to generate graph visualization
    graph_image = supervisor.get_graph().draw_mermaid_png()
    display(Image(graph_image))
    print("✓ Graph visualization displayed above")
except Exception as e:
    print(f"Could not generate graph image: {e}")
    print("\nGraph Structure (text):")
    print("=" * 60)
    print("START")
    print("  ↓")
    print("┌────────────┐")
    print("│ genie      │  ← Queries Genie Space for data")
    print("└─────┬──────┘")
    print("      ↓ (forced edge)")
    print("┌────────────┐")
    print("│ supervisor │  ← Creates 2-line summary + table")
    print("└─────┬──────┘")
    print("      ↓")
    print("    END")
    print("=" * 60)
    print("\nFlow:")
    print("1. User question → genie node")
    print("2. Genie returns table → supervisor node")
    print("3. Supervisor adds 2-line summary → END")
    print("4. Response: Summary + Table → User")


## Test the Agent

Test the agent locally before deploying. You should see:
1. **Summary** from the supervisor (natural language insights)
2. **Table** from Genie (structured data)
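A quick way to confirm both parts are present is to split the response text into prose lines and table lines. The sketch below assumes the response is plain text with a markdown table. `split_summary_and_table` and `looks_like_expected_format` are hypothetical helpers, and the sample text is the example from the supervisor prompt, not real Genie output.

```python
def split_summary_and_table(text: str) -> tuple[list[str], list[str]]:
    """Separate prose lines from markdown table lines, ignoring blank lines."""
    summary, table = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("|"):
            table.append(stripped)
        else:
            summary.append(stripped)
    return summary, table


def looks_like_expected_format(text: str) -> bool:
    """True when there are exactly 2 summary lines and a table with data rows."""
    summary, table = split_summary_and_table(text)
    # A usable table needs a header, a separator, and at least one data row.
    return len(summary) == 2 and len(table) >= 3


sample = (
    "Sales has highest attrition at 15.2%, above the 8.1% average.\n"
    "Engineering shows best retention at 6.3% with effective programs.\n"
    "\n"
    "| Department | Rate  | Count |\n"
    "|------------|-------|-------|\n"
    "| Sales      | 15.2% | 450   |\n"
    "| Engineering| 6.3%  | 520   |\n"
)

print(looks_like_expected_format(sample))  # → True
```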


In [None]:
dbutils.library.restartPython()


In [None]:
from agent import AGENT

# Test with a question that will require Genie to query data
input_example = {
    "input": [
        {"role": "user", "content": "Which department has the highest attrition rate?"}
    ]
}

# Get the response
response = AGENT.predict(input_example)
print(response)


In [None]:
# Test streaming to see the flow
print("=" * 80)
print("STREAMING OUTPUT (shows agent handoffs and responses)")
print("=" * 80)

for event in AGENT.predict_stream(input_example):
    output = event.model_dump(exclude_none=True)
    
    # Extract and display content
    if 'item' in output and 'content' in output['item']:
        for content_item in output['item']['content']:
            if 'text' in content_item:
                text = content_item['text']
                
                # Highlight agent names
                if text.startswith('<name>'):
                    print(f"\n{'='*60}")
                    print(f"➜ Agent: {text}")
                    print(f"{'='*60}\n")
                else:
                    print(text)


## Log the Agent to MLflow

Log the agent with automatic authentication for Databricks resources.


In [None]:
import mlflow
from agent import EXTERNALLY_SERVED_AGENTS, LLM_ENDPOINT_NAME, TOOLS, Genie
from databricks_langchain import UnityCatalogTool, VectorSearchRetrieverTool
from mlflow.models.resources import (
    DatabricksFunction,
    DatabricksGenieSpace,
    DatabricksServingEndpoint,
    DatabricksSQLWarehouse,
    DatabricksTable
)
from pkg_resources import get_distribution

# Configure resources for automatic authentication
resources = [DatabricksServingEndpoint(endpoint_name=LLM_ENDPOINT_NAME)]

# Add SQL Warehouse and tables for Genie Space
# TODO: Update these with your actual warehouse and table names
resources.append(DatabricksSQLWarehouse(warehouse_id="148ccb90800933a1"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.fact_attrition_snapshots"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.dim_employees"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.fact_compensation"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.fact_performance"))
resources.append(DatabricksTable(table_name="akash_s_demo.talent.fact_role_history"))

# Add UC function tools if any
for tool in TOOLS:
    if isinstance(tool, VectorSearchRetrieverTool):
        resources.extend(tool.resources)
    elif isinstance(tool, UnityCatalogTool):
        resources.append(DatabricksFunction(function_name=tool.uc_function_name))

# Add Genie Space
for agent in EXTERNALLY_SERVED_AGENTS:
    if isinstance(agent, Genie):
        resources.append(DatabricksGenieSpace(genie_space_id=agent.space_id))
    else:
        resources.append(DatabricksServingEndpoint(endpoint_name=agent.endpoint_name))

# Log the model
with mlflow.start_run():
    logged_agent_info = mlflow.pyfunc.log_model(
        name="agent",
        python_model="agent.py",
        resources=resources,
        pip_requirements=[
            f"databricks-connect=={get_distribution('databricks-connect').version}",
            f"mlflow=={get_distribution('mlflow').version}",
            f"databricks-langchain=={get_distribution('databricks-langchain').version}",
            f"langgraph=={get_distribution('langgraph').version}",
            f"langgraph-supervisor=={get_distribution('langgraph-supervisor').version}",
        ],
    )

print("✅ Model logged successfully!")
print(f"Run ID: {logged_agent_info.run_id}")
print(f"Model URI: {logged_agent_info.model_uri}")


## Register to Unity Catalog


In [None]:
mlflow.set_registry_uri("databricks-uc")

# TODO: Update these with your catalog, schema, and model name
catalog = "akash_s_demo"
schema = "talent"
model_name = "mobility_attrition_with_summary"
UC_MODEL_NAME = f"{catalog}.{schema}.{model_name}"

# Register the model
uc_registered_model_info = mlflow.register_model(
    model_uri=logged_agent_info.model_uri, name=UC_MODEL_NAME
)

print("✅ Model registered to Unity Catalog!")
print(f"Model: {UC_MODEL_NAME}")
print(f"Version: {uc_registered_model_info.version}")


## Deploy the Agent

Deploy the agent to a serving endpoint.


In [None]:
from databricks import agents

# Deploy the agent
deployment_info = agents.deploy(
    UC_MODEL_NAME, 
    uc_registered_model_info.version,
    tags={"enhanced": "with_summary"},
    deploy_feedback_model=False
)

print("\n" + "="*80)
print("🚀 DEPLOYMENT INITIATED")
print("="*80)
print("\nYour agent with enhanced summarization is being deployed!")
print("\n📊 What to expect:")
print("  • Natural language summaries from Llama 3.1")
print("  • Structured tables from Genie")
print("  • Both in a single response")
print("\nThis deployment can take up to 15 minutes.")
print("\n" + "="*80)


## Example Output

### Question: "Give me attrition rates for each BU"

**What you'll get:**

```
Sales department has the highest attrition rate at 15.2%, significantly above the 8.1% company average.
Engineering maintains the strongest retention at 6.3%, indicating effective retention programs in technical roles.

| Department  | Attrition Rate | Employee Count | Avg Tenure |
|-------------|----------------|----------------|------------|
| Sales       | 15.2%          | 450            | 2.3 years  |
| Support     | 12.8%          | 320            | 2.8 years  |
| Marketing   | 10.5%          | 180            | 3.2 years  |
| Operations  | 9.2%           | 280            | 3.8 years  |
| Engineering | 6.3%           | 520            | 4.5 years  |
```

**In your Dash app, this will display as:**
- ✅ **2-line summary** at the top (easy to read)
- ✅ **Formatted table** below (with proper styling)
- ✅ **Agent badge** showing which agent answered
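To render the table with its own styling, a frontend first has to turn the markdown rows into structured records. The sketch below shows one way to do that with the standard library only. `parse_markdown_table` is a hypothetical helper, not the Dash app's actual parser, and it assumes a well-formed table whose second row is the separator.

```python
def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Convert a markdown table embedded in text into a list of row dicts."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")
    ]
    if len(rows) < 3:
        return []  # No header + separator + data rows found.
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator.
    return [dict(zip(header, row)) for row in body]


response_text = """Sales department has the highest attrition rate at 15.2%.
Engineering maintains the strongest retention at 6.3%.

| Department  | Attrition Rate | Employee Count | Avg Tenure |
|-------------|----------------|----------------|------------|
| Sales       | 15.2%          | 450            | 2.3 years  |
| Engineering | 6.3%           | 520            | 4.5 years  |
"""

records = parse_markdown_table(response_text)
print(records[0]["Department"], records[0]["Attrition Rate"])  # → Sales 15.2%
```

The resulting list of dicts has the shape most table components accept as row data.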

## Key Features

✅ **Concise** - Exactly 2 lines of summary, no fluff  
✅ **Specific** - Uses actual numbers from the data  
✅ **Complete** - Full table preserved for detailed analysis  
✅ **Frontend Ready** - Dash app already parses and displays this format
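
The streamed output tags each node with a `<name>...</name>` item before its messages, which is what makes an agent badge possible. The sketch below groups streamed text items by the most recent tag. `group_by_agent` is a hypothetical helper and the input list is made-up sample text, so treat it as an illustration of the tag convention rather than the Dash app's parsing code.

```python
import re

# Matches the node-name tag emitted before each node's messages.
NAME_TAG = re.compile(r"^<name>(.+)</name>$")


def group_by_agent(texts: list[str]) -> dict[str, list[str]]:
    """Assign each text item to the agent named by the preceding <name> tag."""
    grouped: dict[str, list[str]] = {}
    current = "unknown"  # Used for any text that arrives before the first tag.
    for text in texts:
        match = NAME_TAG.match(text.strip())
        if match:
            current = match.group(1)
            grouped.setdefault(current, [])
        else:
            grouped.setdefault(current, []).append(text)
    return grouped


streamed = [
    "<name>genie</name>",
    "| Department | Rate |",
    "<name>supervisor</name>",
    "Sales has the highest attrition.",
    "Engineering has the lowest.",
]

print(group_by_agent(streamed))
```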
