# Multi-Agent System with Genie + LLM Summarization

This notebook creates a multi-agent system where:
1. **Genie Agent** provides structured data (tables, statistics)
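2. **Supervisor Agent** (an LLM behind a Databricks serving endpoint; `databricks-gpt-5-nano` in the configuration below) routes questions and creates natural language summaries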
3. **Output includes BOTH** the table and the summary

## Prerequisites
- Genie Space created and configured
- Databricks serving endpoint access


In [None]:
%pip install -U -qqq langgraph-supervisor==0.0.30 mlflow[databricks] databricks-langchain databricks-agents databricks-ai-bridge uv 
dbutils.library.restartPython()


## Define the Multi-Agent System with Intelligent Routing

### Graph Architecture (Nodes, Edges, and Conditional Routing):

```
                    START
                      ↓
            ┌──────────────────┐
            │Supervisor Router │  ← Classifies question
            └────────┬─────────┘
                     │
        ┌────────────┴────────────┐
        │                         │
     TALENT                    OTHER
        │                         │
        ↓                         ↓
  ┌──────────┐            "No data available"
  │  Genie   │                   │
  └────┬─────┘                   ↓
       │                        END
       ↓
┌─────────────────┐
│Supervisor       │  ← Creates summary + table
│Summarizer       │
└────────┬────────┘
         │
         ↓
        END
```

### Key Features:
- ✅ **Intelligent routing**: Supervisor decides if question is talent-related
- ✅ **Conditional logic**: Only calls Genie for talent questions
- ✅ **Efficient**: Avoids unnecessary API calls for off-topic questions
- ✅ **Guaranteed format**: All Genie responses get 2-line summary + table
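
The routing behaviour listed above can be sketched without LangGraph or any Databricks dependency. This is a minimal illustration under stated assumptions, not the notebook's implementation: `route` mirrors the substring check used by the router node, while `run_pipeline`, `fake_classify`, `fake_genie`, and `fake_summarize` are hypothetical stand-ins for the LLM and Genie calls, and the table values are made up.

```python
def route(decision: str) -> str:
    """Map the classifier's raw text output to the next node name."""
    return "genie" if "TALENT" in decision.strip().upper() else "end"


def run_pipeline(question, classify, query_genie, summarize):
    """Call Genie and the summarizer only when the router picks 'genie'."""
    if route(classify(question)) == "end":
        return "No data available"
    table = query_genie(question)
    # The summary always comes first, followed by the untouched table
    return f"{summarize(table)}\n\n{table}"


# Hypothetical stand-ins (the real notebook calls an LLM and the Genie API)
genie_calls = []


def fake_classify(question):
    return "TALENT" if "attrition" in question.lower() else "OTHER"


def fake_genie(question):
    genie_calls.append(question)  # record the call so we can see when Genie runs
    return "| dept | rate |\n| --- | --- |\n| Sales | 0.12 |"  # made-up data


def fake_summarize(table):
    return "Line one.\nLine two."


talent_answer = run_pipeline("Attrition by team", fake_classify, fake_genie, fake_summarize)
other_answer = run_pipeline("What's the weather today?", fake_classify, fake_genie, fake_summarize)
print(talent_answer)
print(other_answer)
```

Only the talent question reaches `fake_genie`, which is the point of the conditional edge: off-topic questions end at the router without a Genie call.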


In [None]:
%%writefile agent.py
import json
from typing import Generator, Literal
from uuid import uuid4

import mlflow
from databricks_langchain import (
    ChatDatabricks,
    DatabricksFunctionClient,
    UCFunctionToolkit,
    set_uc_function_client,
)
from databricks_ai_bridge import ModelServingUserCredentials  # OBO authentication
from databricks.sdk import WorkspaceClient  # For Genie OBO with explicit credentials
from langchain_core.runnables import Runnable
from langchain.agents import create_agent
from langgraph.graph.state import CompiledStateGraph
from langgraph_supervisor import create_supervisor
from mlflow.pyfunc import ResponsesAgent
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
    output_to_responses_items_stream,
    to_chat_completions_input,
)
from pydantic import BaseModel

########################################
# Agent Configuration Models
########################################

GENIE = "genie"


class ServedSubAgent(BaseModel):
    endpoint_name: str
    name: str
    task: Literal["agent/v1/responses", "agent/v1/chat", "agent/v2/chat"]
    description: str


class Genie(BaseModel):
    space_id: str
    name: str
    task: str = GENIE
    description: str


class InCodeSubAgent(BaseModel):
    tools: list[str]
    name: str
    description: str


def stringify_content(state):
    """Convert content to string format for processing"""
    msgs = state["messages"]
    if isinstance(msgs[-1].content, list):
        msgs[-1].content = json.dumps(msgs[-1].content, indent=4)
    return {"messages": msgs}


def query_genie_with_obo(workspace_client: WorkspaceClient, space_id: str, question: str) -> str:
    """
    Query Genie Space using OBO credentials via direct API call.
    
    This bypasses the GenieAgent wrapper to ensure user credentials are used.
    
    Args:
        workspace_client: WorkspaceClient with OBO credentials
        space_id: Genie Space ID
        question: User's question
    
    Returns:
        Genie's response as string
    """
    import time
    
    try:
        # Start a conversation with Genie using OBO WorkspaceClient
        conversation = workspace_client.genie.start_conversation(
            space_id=space_id,
            content=question
        )
        
        conversation_id = conversation.conversation_id
        message_id = conversation.message_id
        
        # Poll for results
        max_wait = 60  # seconds
        start_time = time.time()
        
        while time.time() - start_time < max_wait:
            # Get message status
            message = workspace_client.genie.get_message(
                space_id=space_id,
                conversation_id=conversation_id,
                message_id=message_id
            )
            
            # The SDK reports status as an Enum member, so compare on its string value
            status = getattr(message.status, "value", message.status)
            
            if status == "COMPLETED":
                # Get the response attachments
                if message.attachments:
                    # Genie returns data in attachments
                    response_parts = []
                    for attachment in message.attachments:
                        if hasattr(attachment, 'text') and attachment.text:
                            response_parts.append(attachment.text.content)
                        elif hasattr(attachment, 'query') and attachment.query:
                            # Include query results if available
                            if hasattr(attachment.query, 'result_data'):
                                response_parts.append(str(attachment.query.result_data))
                    
                    return "\n\n".join(response_parts) if response_parts else message.content
                
                return message.content or "No response from Genie"
            
            elif status == "FAILED":
                error_msg = f"Genie query failed: {message.error if hasattr(message, 'error') else 'Unknown error'}"
                print(f"ERROR: {error_msg}")
                return f"Error querying Genie: {error_msg}"
            
            # Still processing, wait and retry
            time.sleep(2)
        
        return "Genie query timed out after 60 seconds"
    
    except Exception as e:
        error_msg = f"Error calling Genie API: {str(e)}"
        print(f"ERROR: {error_msg}")
        return error_msg


########################################
# Create Custom LangGraph with Explicit Nodes and Edges
########################################

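from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from typing import TypedDict, Annotated


# Define the state structure
class AgentState(TypedDict):
    # add_messages appends node updates and converts the chat-completion dicts
    # passed in by predict_stream into message objects, so the nodes below can
    # rely on isinstance(msg, HumanMessage) and msg.content
    messages: Annotated[list, add_messages]
    next_step: str  # Track routing decision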


def create_langgraph_with_nodes(
    llm: Runnable,
    workspace_client: WorkspaceClient,  # OBO-enabled WorkspaceClient
    externally_served_agents: list[ServedSubAgent] = [],
):
    """
    Create a LangGraph with intelligent routing:
    - User Question → Supervisor Router (decides if talent-related)
    - If YES → Genie Node (gets data with OBO) → Supervisor Summarizer (creates summary)
    - If NO → Direct response (no data available)
    - Final Response → END (returns to user)
    
    Args:
        llm: Language model for routing and summarization
        workspace_client: OBO-enabled WorkspaceClient for Genie API calls
        externally_served_agents: List of served agents (Genie)
    """
    
    # Find Genie space configuration
    genie_space_id = None
    genie_name = None
    genie_description = None
    
    for agent in externally_served_agents:
        if isinstance(agent, Genie):
            genie_space_id = agent.space_id
            genie_name = agent.name
            genie_description = agent.description
            break
    
    if not genie_space_id:
        raise ValueError("Genie agent configuration is required")
    
    # Define Supervisor Router node (decides if talent-related)
    def supervisor_router(state: AgentState):
        """Supervisor router - determines if question is about talent/workforce data"""
        messages = state["messages"]
        
        # Get the user's most recent question (scan from the newest message backwards)
        user_question = ""
        for msg in reversed(messages):
            if hasattr(msg, 'content') and msg.content:
                if isinstance(msg, HumanMessage) or (hasattr(msg, 'role') and msg.role == 'user'):
                    user_question = msg.content
                    break
        
        print(f"DEBUG Router - User question: {user_question[:100]}")
        
        # Use LLM with few-shot examples for robust classification
        routing_prompt = f"""You are a routing assistant for a TALENT & WORKFORCE ANALYTICS chatbot.

This chatbot can ONLY answer questions about:
✓ Workforce data (employees, headcount, demographics)
✓ Organizational structure (departments, business units, teams, managers)
✓ Attrition & retention (turnover, exits, resignations, churn)
✓ Employee mobility (promotions, transfers, career paths)
✓ HR metrics (compensation, performance, tenure, reviews)
✓ Workforce trends and analytics

The chatbot CANNOT answer questions about:
✗ General knowledge, facts, or trivia
✗ Current events, news, or weather
✗ Products, services, or customer data (unless about employees)
✗ Technical support or IT issues
✗ Anything unrelated to employees/workforce

TASK: Classify if the following question can be answered with workforce/talent data.

Question: "{user_question}"

CLASSIFICATION EXAMPLES (learn the pattern):

"Which department has highest attrition rate?" → TALENT (clearly about workforce)
"Give me BU level attrition details" → TALENT (BU = business unit, workforce metric)
"Show me employee turnover" → TALENT (employee metric)
"Attrition by team" → TALENT (workforce analytics)
"What percentage of people left?" → TALENT (people = employees)
"Details about organizational structure" → TALENT (org data)
"How many staff in each division?" → TALENT (headcount query)
"Show me the data" → TALENT (assume workforce data in this context)
"Give me details" → TALENT (assume talent details in this context)
"What are the rates?" → TALENT (likely workforce rates)
"What's the weather today?" → OTHER (unrelated to workforce)
"Tell me about Python programming" → OTHER (technical, not workforce)
"What is a good restaurant?" → OTHER (unrelated to talent)

DECISION RULE:
- If the question could plausibly be asking for workforce/talent/organizational data → TALENT
- If the question is clearly about a non-workforce topic → OTHER
- When in doubt, choose TALENT (better to try and fail than miss valid questions)

Your classification (respond with ONLY the word "TALENT" or "OTHER"):"""
        
        routing_response = llm.invoke([HumanMessage(content=routing_prompt)])
        decision = routing_response.content.strip().upper()
        
        print(f"DEBUG Router - Decision: {decision}")
        
        if "TALENT" in decision:
            # Route to Genie
            return {"next_step": "genie", "messages": []}
        else:
            # Return message saying we don't have data
            response_msg = AIMessage(
                content="I'm specialized in talent and workforce analytics. I don't have information about that topic. Please ask me questions about attrition, employee mobility, retention, or workforce trends.",
                name="supervisor"
            )
            return {"next_step": "end", "messages": [response_msg]}
    
    # Define Genie node function (uses direct API with OBO)
    def genie_node(state: AgentState):
        """Genie node - queries data using OBO credentials via direct API"""
        messages = state["messages"]
        
        print(f"DEBUG Genie - Input messages: {len(messages)}")
        
        # Extract the user's most recent question (scan from the newest message backwards)
        user_question = ""
        for msg in reversed(messages):
            if hasattr(msg, 'content') and msg.content:
                if isinstance(msg, HumanMessage) or (hasattr(msg, 'role') and msg.role == 'user'):
                    user_question = msg.content
                    break
        
        if not user_question:
            print("ERROR Genie - No user question found!")
            return {"messages": [AIMessage(content="Error: No question provided to Genie", name="genie")]}
        
        print(f"DEBUG Genie - Querying with: {user_question[:100]}")
        
        # Query Genie API directly with OBO credentials
        genie_response = query_genie_with_obo(
            workspace_client=workspace_client,
            space_id=genie_space_id,
            question=user_question
        )
        
        print(f"DEBUG Genie - Response length: {len(genie_response)}")
        print(f"DEBUG Genie - Response preview: {genie_response[:200]}")
        
        # Return as AIMessage
        genie_message = AIMessage(content=genie_response, name="genie")
        return {"messages": [genie_message]}
    
    # Helper function to clean pandas-formatted markdown tables
    def clean_pandas_table(text):
        """
        Remove the pandas index column from markdown tables.
        Converts: |    | col1 | col2 |  →  | col1 | col2 |
                  |---:|:-----|------|      |:-----|------|
                  |  0 | val1 | val2 |      | val1 | val2 |

        A table is treated as pandas output only when the first cell of its
        header row is blank, so ordinary tables whose first column happens to
        be numeric keep all of their columns.
        """
        def split_row(line):
            # Split a markdown table row into cells, dropping the outer pipes
            return [cell.strip() for cell in line.strip().strip('|').split('|')]
        
        cleaned_lines = []
        in_table = False    # True while consecutive table rows are being read
        drop_index = False  # True when the current table has a pandas index column
        
        for line in text.split('\n'):
            stripped = line.strip()
            
            if not (stripped.startswith('|') and stripped.count('|') >= 2):
                # Not a table row: reset the table state and keep the line as-is
                in_table = False
                drop_index = False
                cleaned_lines.append(line)
                continue
            
            cells = split_row(line)
            
            if not in_table:
                # First row of a table is the header; pandas leaves the index
                # column's header cell blank
                in_table = True
                drop_index = len(cells) > 1 and cells[0] == ''
            
            if drop_index:
                # Drop the index cell from header, separator, and data rows alike
                cells = cells[1:]
            
            # Rebuild the line with clean cells
            cleaned_lines.append('| ' + ' | '.join(cells) + ' |')
        
        return '\n'.join(cleaned_lines)
    
    # Define Supervisor Summarizer node (creates summary after Genie)
    def supervisor_summarizer(state: AgentState):
        """Supervisor summarizer - creates 2-line summary + preserves Genie's table"""
        messages = state["messages"]
        
        # Get ALL messages - find the one from Genie (should be AI message after user message)
        genie_response = ""
        
        # Look for the last AI message (from Genie)
        for msg in reversed(messages):
            if hasattr(msg, 'content') and msg.content:
                # Check if it's an AI message and has actual content
                if isinstance(msg, AIMessage) or (hasattr(msg, 'type') and msg.type == 'ai'):
                    content = str(msg.content)
                    # Skip if it's too short or empty
                    if content and len(content.strip()) > 10:
                        genie_response = content
                        break
        
        # Debug: Print what we got from Genie
        print(f"DEBUG - Messages count: {len(messages)}")
        print(f"DEBUG - Genie response length: {len(genie_response) if genie_response else 0}")
        if genie_response:
            print(f"DEBUG - Genie response preview: {genie_response[:200]}...")
        
        if not genie_response or len(genie_response.strip()) < 10:
            # If still no response, get the full state for debugging
            error_msg = f"No data received from Genie. Messages in state: {len(messages)}"
            print(f"ERROR: {error_msg}")
            for i, msg in enumerate(messages):
                print(f"  Message {i}: type={type(msg).__name__}, has_content={hasattr(msg, 'content')}")
                if hasattr(msg, 'content'):
                    content_preview = str(msg.content)[:100]
                    print(f"    Content preview: {content_preview}")
            return {"messages": [AIMessage(content=error_msg)]}
        
        # Check if Genie returned an error instead of data
        if any(error_keyword in genie_response for error_keyword in ["Error", "PERMISSION_DENIED", "FAILED", "failed with error"]):
            print(f"ERROR: Genie returned error: {genie_response[:200]}")
            error_msg = "I apologize, but I don't have access to the requested data. This may be due to data permissions or connectivity issues. Please contact your administrator if you believe you should have access to this information."
            return {"messages": [AIMessage(content=error_msg, name="supervisor_summarizer")]}
        
        # Clean up pandas-formatted tables (remove index column)
        genie_response = clean_pandas_table(genie_response)
        print(f"DEBUG - Cleaned genie response length: {len(genie_response)}")
        print(f"DEBUG - Cleaned genie response preview: {genie_response[:200]}...")
        
        # Create supervisor prompt - be VERY explicit (NO EXAMPLES to avoid hallucination)
        system_prompt = """You are a data analyst. Your job is to write a 2-line summary and include the original table.

OUTPUT FORMAT:
[Line 1: Key finding from the actual data provided]
[Line 2: Second insight from the actual data provided]

[PASTE THE COMPLETE ORIGINAL TABLE HERE EXACTLY AS PROVIDED]

CRITICAL RULES:
- Write EXACTLY 2 short lines analyzing ONLY the actual data provided below
- NEVER make up data or use example data
- Add blank line
- Copy the COMPLETE original table unchanged
- Use ONLY data from the table provided to you
- That's it - nothing else"""
        
        # Create messages for LLM with explicit instruction
        user_prompt = f"""Here is the data with a table:

{genie_response}

Instructions:
1. Write 2 lines summarizing the key findings
2. Include the complete table from above

Your response:"""
        
        supervisor_messages = [
            SystemMessage(content=system_prompt),
            HumanMessage(content=user_prompt)
        ]
        
        # Get summary from LLM
        summary_response = llm.invoke(supervisor_messages)
        
        # Combine summary with original table to ensure table is preserved
        final_response = summary_response.content
        
        # If the table isn't in the response, append it
        if '|' not in final_response and '|' in genie_response:
            print("DEBUG - Table not in LLM response, appending original table")
            final_response = f"{final_response}\n\n{genie_response}"
        
        print(f"DEBUG Supervisor - Final response length: {len(final_response)}")
        print(f"DEBUG Supervisor - Response preview: {final_response[:300]}")
        
        # Create a message with explicit ID to ensure it's unique
        summary_message = AIMessage(
            content=final_response,
            name="supervisor_summarizer",
            id=str(uuid4())  # Ensure unique ID
        )
        
        print(f"DEBUG Supervisor - Created message with ID: {summary_message.id}")
        
        # Return as a single message with the summary content
        return {"messages": [summary_message]}
    
    # Conditional edge function
    def route_after_supervisor(state: AgentState):
        """Route based on supervisor's decision"""
        next_step = state.get("next_step", "end")
        print(f"DEBUG Routing - Next step: {next_step}")
        
        if next_step == "genie":
            return "genie"
        else:
            return END
    
    # Build the graph with conditional routing
    workflow = StateGraph(AgentState)
    
    # Add nodes
    workflow.add_node("supervisor_router", supervisor_router)
    workflow.add_node("genie", genie_node)
    workflow.add_node("supervisor_summarizer", supervisor_summarizer)
    
    # Define edges with conditional routing
    # START → Supervisor Router (decides if talent-related)
    workflow.set_entry_point("supervisor_router")
    
    # Supervisor Router → Genie (if talent) OR END (if not)
    workflow.add_conditional_edges(
        "supervisor_router",
        route_after_supervisor,
        {
            "genie": "genie",
            END: END
        }
    )
    
    # Genie → Supervisor Summarizer (ALWAYS)
    workflow.add_edge("genie", "supervisor_summarizer")
    
    # Supervisor Summarizer → END (ALWAYS)
    workflow.add_edge("supervisor_summarizer", END)
    
    return workflow.compile()


##########################################
# Wrap LangGraph Supervisor as a ResponsesAgent with OBO
##########################################


class LangGraphResponsesAgent(ResponsesAgent):
    """
    ResponsesAgent that creates OBO-enabled resources PER REQUEST.
    
    CRITICAL: OBO resources (LLM, clients, agents) are initialized in predict/predict_stream,
    NOT in __init__, because user identity is only available at query time.
    Uses ModelServingUserCredentials for on-behalf-of authentication.
    """
    
    def __init__(self, llm_endpoint_name: str, externally_served_agents: list):
        """
        Store configuration only - NO OBO resource initialization here!
        
        Args:
            llm_endpoint_name: Name of the LLM serving endpoint
            externally_served_agents: List of agent configs (Genie, etc.)
        """
        self.llm_endpoint_name = llm_endpoint_name
        self.externally_served_agents = externally_served_agents
        print("✓ LangGraphResponsesAgent initialized (config stored, OBO resources deferred)")

    def _create_graph_with_obo(self):
        """
        Create graph with OBO-enabled resources.
        
        Called inside predict/predict_stream where user identity is available.
        This ensures ModelServingUserCredentials() has access to the request context.
        
        CRITICAL: We create an OBO WorkspaceClient and pass it explicitly to the graph
        so Genie API calls use the user's credentials for RLS enforcement.
        """
        # Create OBO-enabled credentials strategy
        obo_creds = ModelServingUserCredentials()
        
        # Create OBO-enabled client for UC functions
        client = DatabricksFunctionClient(credentials_strategy=obo_creds)
        set_uc_function_client(client)
        
        # Create OBO-enabled LLM
        llm = ChatDatabricks(
            endpoint=self.llm_endpoint_name,
            credentials_strategy=obo_creds
        )
        
        # Create OBO-enabled WorkspaceClient for Genie API calls
        # This will be passed to the graph and used for direct Genie API calls
        workspace_client = WorkspaceClient(credentials_strategy=obo_creds)
        
        # Create the graph with OBO resources
        # The workspace_client ensures Genie queries use user credentials
        graph = create_langgraph_with_nodes(
            llm=llm,
            workspace_client=workspace_client,
            externally_served_agents=self.externally_served_agents
        )
        return graph

    def predict(self, request: ResponsesAgentRequest) -> ResponsesAgentResponse:
        """
        Predict method - creates OBO graph per request.
        
        User identity is available here via request context.
        """
        outputs = [
            event.item
            for event in self.predict_stream(request)
            if event.type == "response.output_item.done"
        ]
        return ResponsesAgentResponse(output=outputs, custom_outputs=request.custom_inputs)

    def predict_stream(
        self,
        request: ResponsesAgentRequest,
    ) -> Generator[ResponsesAgentStreamEvent, None, None]:
        """
        Streaming predict - creates OBO graph per request.
        
        User identity is available here via request context.
        """
        # Create OBO-enabled graph for THIS request with THIS user's credentials
        agent = self._create_graph_with_obo()
        
        cc_msgs = to_chat_completions_input([i.model_dump() for i in request.input])
        seen_ids = set()

        for _, events in agent.stream({"messages": cc_msgs}, stream_mode=["updates"]):
            node_name = tuple(events.keys())[0] if events else "unknown"
            
            print(f"DEBUG Stream - Node: {node_name}")
            
            # Get messages from this node
            new_msgs = []
            for v in events.values():
                msgs_in_update = v.get("messages", [])
                print(f"DEBUG Stream - Messages in update: {len(msgs_in_update)}")
                
                for msg in msgs_in_update:
                    if hasattr(msg, 'id') and msg.id not in seen_ids:
                        new_msgs.append(msg)
                        seen_ids.add(msg.id)
                        print(f"DEBUG Stream - Added message from {node_name}: {type(msg).__name__}")
            
            # ALWAYS emit node name tag when a node executes
            print(f"DEBUG Stream - Emitting tag for: {node_name}")
            yield ResponsesAgentStreamEvent(
                type="response.output_item.done",
                item=self.create_text_output_item(
                    text=f"<name>{node_name}</name>", id=str(uuid4())
                ),
            )
            
            # Emit the actual messages if any
            if new_msgs:
                print(f"DEBUG Stream - Emitting {len(new_msgs)} messages from {node_name}")
                yield from output_to_responses_items_stream(new_msgs)
            else:
                print(f"DEBUG Stream - No new messages to emit from {node_name}")


#######################################################
# Configuration (NO OBO resources initialized here!)
#######################################################

# Foundation model endpoint name (LLM initialized per-request with OBO)
LLM_ENDPOINT_NAME = "databricks-gpt-5-nano"

# Configure your Genie Space (agent created per-request with OBO)
EXTERNALLY_SERVED_AGENTS = [
    Genie(
        space_id="01f0c9f705201d14b364f5daf28bb639",  # TODO: Update with your Genie Space ID
        name="talent_genie",
        description="Analyzes talent stability, mobility patterns, attrition risk, and workforce trends. Provides structured data including statistics, tables, and detailed breakdowns by department, role, tenure, and other dimensions."
    ),
]

# Optional: Add UC function-calling agents
IN_CODE_AGENTS = []

# Tools for UC function calling (if any)
TOOLS = []

print("✓ Agent configuration loaded (OBO resources will be created per-request)")

# Disable autolog to avoid permission issues with tracing
# mlflow.langchain.autolog()

AGENT = LangGraphResponsesAgent(LLM_ENDPOINT_NAME, EXTERNALLY_SERVED_AGENTS)
mlflow.models.set_model(AGENT)


## Visualize Graph Structure

Display the node and edge structure of the LangGraph.
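
When PNG rendering is unavailable, the same structure can be written out as Mermaid text and pasted into any Mermaid viewer. The sketch below is dependency-free and only illustrative: `edges_to_mermaid` is a hypothetical helper, not a LangGraph function, and the `EDGES` list is transcribed by hand from the nodes and edges defined in `agent.py`.

```python
# Edges of the graph built in agent.py, as (source, target, label) triples.
# The labels are the router's two classifications.
EDGES = [
    ("START", "supervisor_router", ""),
    ("supervisor_router", "genie", "TALENT"),
    ("supervisor_router", "END", "OTHER"),
    ("genie", "supervisor_summarizer", ""),
    ("supervisor_summarizer", "END", ""),
]


def edges_to_mermaid(edges):
    """Render (source, target, label) triples as a Mermaid flowchart (hypothetical helper)."""
    lines = ["graph TD"]
    for source, target, label in edges:
        # Labelled edges use Mermaid's `A -->|text| B` form
        arrow = f"-->|{label}|" if label else "-->"
        lines.append(f"    {source} {arrow} {target}")
    return "\n".join(lines)


mermaid_text = edges_to_mermaid(EDGES)
print(mermaid_text)
```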

In [None]:
from IPython.display import Image, display

try:
    # For visualization, create a temporary graph without OBO
    # (OBO is only needed at query time, not for graph structure visualization)
    from databricks_langchain import ChatDatabricks
    from databricks.sdk import WorkspaceClient
    from agent import create_langgraph_with_nodes, LLM_ENDPOINT_NAME, EXTERNALLY_SERVED_AGENTS
    
    # Create non-OBO resources just for visualization
    temp_llm = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME)
    temp_workspace_client = WorkspaceClient()  # Default credentials for visualization only
    temp_graph = create_langgraph_with_nodes(temp_llm, temp_workspace_client, EXTERNALLY_SERVED_AGENTS)
    
    graph_image = temp_graph.get_graph().draw_mermaid_png()
    display(Image(graph_image))
    print("✓ Graph visualization displayed above")
except Exception as e:
    print(f"Could not generate graph image: {e}")
    print("\nGraph Structure (text):")
    print("=" * 80)
    print("                         START")
    print("                           ↓")
    print("                 ┌─────────────────────┐")
    print("                 │ supervisor_router   │  ← Decides: Is this talent-related?")
    print("                 └──────────┬──────────┘")
    print("                            │")
    print("              ┌─────────────┴─────────────┐")
    print("              │                           │")
    print("           TALENT                      OTHER")
    print("              │                           │")
    print("              ↓                           ↓")
    print("      ┌───────────────┐               \"No data\"")
    print("      │     genie     │                   │")
    print("      │               │                   ↓")
    print("      │ (Query data)  │                  END")
    print("      └───────┬───────┘")
    print("              ↓")
    print("      ┌───────────────────────┐")
    print("      │ supervisor_summarizer │  ← Creates 2-line summary + table")
    print("      └───────────┬───────────┘")
    print("                  ↓")
    print("                 END")
    print("=" * 80)
    print("\nFlow Examples:")
    print("\n1. TALENT Question: 'Which department has highest attrition?'")
    print("   → Router: TALENT → Genie (gets table) → Summarizer (adds summary) → User")
    print("\n2. OTHER Question: 'What's the weather?'")
    print("   → Router: OTHER → 'I don't have that data' → User")


## Test the Agent

Test the agent locally before deploying. You should see:
1. **Summary** from the supervisor (natural language insights)
2. **Table** from Genie (structured data)
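
A quick way to confirm that both parts are present is to split a response string into summary lines and table rows. The helper below is a hypothetical sketch, not part of the agent: `split_summary_and_table` assumes the table arrives as markdown rows that start with `|`, and the sample text uses made-up values.

```python
def split_summary_and_table(text: str):
    """Separate markdown table rows (lines starting with '|') from other non-empty lines."""
    summary, table = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # ignore the blank line between summary and table
        (table if stripped.startswith("|") else summary).append(stripped)
    return summary, table


# Made-up response in the expected shape: 2 summary lines, blank line, table
sample = (
    "Sales has the highest attrition rate.\n"
    "HR has the lowest.\n"
    "\n"
    "| dept | rate |\n"
    "| --- | --- |\n"
    "| Sales | 0.12 |\n"
    "| HR | 0.03 |"
)

summary_lines, table_rows = split_summary_and_table(sample)
print(f"summary lines: {len(summary_lines)}, table rows: {len(table_rows)}")
```

If `summary_lines` does not contain exactly two entries or `table_rows` is empty for a talent question, the summarizer prompt or the table fallback is worth inspecting.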


In [None]:
dbutils.library.restartPython()


In [None]:
from agent import AGENT

# Test with a question that will require Genie to query data
input_example = {
    "input": [
        {"role": "user", "content": "Which department has the highest attrition rate?"}
    ]
}

# Get the response
response = AGENT.predict(input_example)
print(response)


In [None]:
# Test streaming to see the flow
print("=" * 80)
print("STREAMING OUTPUT (shows agent handoffs and responses)")
print("=" * 80)

for event in AGENT.predict_stream(input_example):
    output = event.model_dump(exclude_none=True)
    
    # Extract and display content
    if 'item' in output and 'content' in output['item']:
        for content_item in output['item']['content']:
            if 'text' in content_item:
                text = content_item['text']
                
                # Highlight agent names
                if text.startswith('<name>'):
                    print(f"\n{'='*60}")
                    print(f"➜ Agent: {text}")
                    print(f"{'='*60}\n")
                else:
                    print(text)
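
The extraction logic in the loop above can also be factored into a small generator, which makes it easier to reuse and to test without a live agent. The following is a minimal sketch, not part of `agent.py`: the helper name `iter_text_chunks` and the sample events are hypothetical, and the event shape is assumed to match what `event.model_dump(exclude_none=True)` returns in the cell above.

```python
from typing import Any, Iterable, Iterator


def iter_text_chunks(events: Iterable[dict[str, Any]]) -> Iterator[tuple[str, str]]:
    """Yield (kind, text) pairs from dumped stream events.

    kind is "agent" for '<name>...' markers and "text" for everything else.
    Events without item/content/text are skipped.
    """
    for output in events:
        item = output.get("item")
        if not isinstance(item, dict):
            continue
        content = item.get("content")
        if not isinstance(content, list):
            continue
        for part in content:
            text = part.get("text") if isinstance(part, dict) else None
            if not text:
                continue
            kind = "agent" if text.startswith("<name>") else "text"
            yield kind, text


# Hypothetical events, shaped like the dumped dicts in the streaming loop
sample_events = [
    {"item": {"content": [{"type": "output_text", "text": "<name>genie</name>"}]}},
    {"item": {"content": [{"type": "output_text", "text": "| Department | Attrition Rate |"}]}},
    {"item": {"type": "function_call"}},  # no content: skipped
]

chunks = list(iter_text_chunks(sample_events))
for kind, text in chunks:
    print(f"[{kind}] {text}")
```

In the notebook, the same helper could be fed with `(e.model_dump(exclude_none=True) for e in AGENT.predict_stream(input_example))`.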


## Log the Agent to MLflow

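Log the agent with an auth policy that combines two modes: system authentication, where the service principal reaches the LLM endpoint and SQL warehouse automatically, and on-behalf-of (OBO) user authentication, where Genie and the underlying tables are queried with the calling user's credentials so that row-level security (RLS) is enforced.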


In [None]:
import mlflow
from agent import EXTERNALLY_SERVED_AGENTS, LLM_ENDPOINT_NAME, TOOLS, Genie
from databricks_langchain import UnityCatalogTool, VectorSearchRetrieverTool
from mlflow.models.resources import (
    DatabricksFunction,
    DatabricksGenieSpace,
    DatabricksServingEndpoint,
    DatabricksSQLWarehouse,
    DatabricksTable
)
from mlflow.models.auth_policy import AuthPolicy, SystemAuthPolicy, UserAuthPolicy
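# importlib.metadata is the stdlib replacement for the deprecated pkg_resources;
# distribution(name).version matches the get_distribution(name).version calls below
from importlib.metadata import distribution as get_distribution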

# Configure resources for SYSTEM authentication (service principal)
# IMPORTANT: For RLS to work, do NOT add tables or Genie Space here!
# Only add infrastructure resources that the service principal needs

resources = [
    DatabricksServingEndpoint(endpoint_name=LLM_ENDPOINT_NAME),  # LLM endpoint
    DatabricksSQLWarehouse(warehouse_id="148ccb90800933a1"),      # Warehouse infrastructure (TODO: replace with your SQL warehouse ID)
]

# DO NOT add tables to system resources - tables accessed via user credentials only!
# Tables are accessed through Genie Space using OBO user credentials
# This ensures RLS is enforced based on the querying user's permissions

# Add UC function tools if any
for tool in TOOLS:
    if isinstance(tool, VectorSearchRetrieverTool):
        resources.extend(tool.resources)
    elif isinstance(tool, UnityCatalogTool):
        resources.append(DatabricksFunction(function_name=tool.uc_function_name))

# Add other served agents (but NOT Genie Space!)
for agent in EXTERNALLY_SERVED_AGENTS:
    if isinstance(agent, Genie):
        # DO NOT add Genie Space to system resources!
        # Genie must be accessed ONLY via user credentials for RLS to work
        # The service principal should NOT have access to Genie Space
        pass
    else:
        resources.append(DatabricksServingEndpoint(endpoint_name=agent.endpoint_name))

# Configure OBO authentication policies
# System auth policy: Agent authenticates to these resources automatically
# NOTE: Genie Space is intentionally NOT in system resources - it's user-only!
systemAuthPolicy = SystemAuthPolicy(resources=resources)

# User auth policy: Define API scopes for on-behalf-of user authentication
userAuthPolicy = UserAuthPolicy(
    api_scopes=[
        "serving.serving-endpoints",     # For LLM endpoint access
        "sql.warehouses",                # For SQL warehouse access
        "sql.statement-execution",       # For executing SQL queries on tables
        "dashboards.genie",              # For Genie Space access (CRITICAL for OBO)
    ]
)

# Log the model with OBO authentication
# Note: Don't pass resources separately - they're already in SystemAuthPolicy
with mlflow.start_run():
    logged_agent_info = mlflow.pyfunc.log_model(
        name="agent",
        python_model="agent.py",
        auth_policy=AuthPolicy(
            system_auth_policy=systemAuthPolicy,
            user_auth_policy=userAuthPolicy
        ),
        pip_requirements=[
            f"databricks-connect=={get_distribution('databricks-connect').version}",
            f"mlflow=={get_distribution('mlflow').version}",
            f"databricks-langchain=={get_distribution('databricks-langchain').version}",
            f"langgraph=={get_distribution('langgraph').version}",
            f"langgraph-supervisor=={get_distribution('langgraph-supervisor').version}",
            "databricks-ai-bridge",  # Required for OBO authentication
        ],
    )

print("✅ Model logged successfully with OBO authentication!")
print(f"Run ID: {logged_agent_info.run_id}")
print(f"Model URI: {logged_agent_info.model_uri}")
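
Because RLS depends on tables and the Genie Space staying out of the system policy, it can help to check the `resources` list before logging. The sketch below is one possible guard, not an MLflow feature: the helper names are hypothetical, and it matches on class names (`DatabricksGenieSpace`, `DatabricksTable`) on the assumption that these are the resource types the comments above refer to.

```python
# Resource class names that should stay out of the system (service principal) policy.
# Assumption: these correspond to "tables or Genie Space" in the logging cell.
USER_ONLY_RESOURCE_TYPES = {"DatabricksGenieSpace", "DatabricksTable"}


def find_user_only_resources(resources):
    """Return the resources whose class name marks them as user-only."""
    return [r for r in resources if type(r).__name__ in USER_ONLY_RESOURCE_TYPES]


def assert_rls_safe(resources):
    """Raise ValueError if a user-only resource appears in the system policy list."""
    offenders = find_user_only_resources(resources)
    if offenders:
        names = ", ".join(sorted({type(r).__name__ for r in offenders}))
        raise ValueError(f"Remove from system resources: {names}")


# Stand-in objects so the sketch runs without MLflow installed; they only mimic
# the class names. In the notebook, pass the real `resources` list instead.
fake_endpoint = type("DatabricksServingEndpoint", (), {})()
fake_genie = type("DatabricksGenieSpace", (), {})()

assert_rls_safe([fake_endpoint])  # passes silently
```

Calling `assert_rls_safe(resources)` just before `SystemAuthPolicy(resources=resources)` would fail fast if a Genie Space or table were added by mistake.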


## Register to Unity Catalog


In [None]:
mlflow.set_registry_uri("databricks-uc")

# TODO: Update these with your catalog, schema, and model name
catalog = "akash_s_demo"
schema = "talent"
model_name = "talent_agent_v1"
UC_MODEL_NAME = f"{catalog}.{schema}.{model_name}"

# Register the model
uc_registered_model_info = mlflow.register_model(
    model_uri=logged_agent_info.model_uri, name=UC_MODEL_NAME
)

print("✅ Model registered to Unity Catalog!")
print(f"Model: {UC_MODEL_NAME}")
print(f"Version: {uc_registered_model_info.version}")


## Deploy the Agent

Deploy the agent to a serving endpoint.


In [None]:
from databricks import agents

# Deploy the agent
deployment_info = agents.deploy(
    UC_MODEL_NAME, 
    uc_registered_model_info.version,
    tags={"enhanced": "with_summary"},
    deploy_feedback_model=False
)

print("\n" + "="*80)
print("🚀 DEPLOYMENT INITIATED")
print("="*80)
print("\nYour agent with enhanced summarization is being deployed!")
print("\n📊 What to expect:")
print("  • Natural language summaries from Llama 3.1")
print("  • Structured tables from Genie")
print("  • Both in a single response")
print("\nThis deployment can take up to 15 minutes.")
print("\n" + "="*80)


## Example Output

### Question: "Give me attrition rates for each BU"

**What you'll get:**

```
Sales department has the highest attrition rate at 15.2%, significantly above the 8.1% company average.
Engineering maintains the strongest retention at 6.3%, indicating effective retention programs in technical roles.

| Department  | Attrition Rate | Employee Count | Avg Tenure |
|-------------|----------------|----------------|------------|
| Sales       | 15.2%          | 450            | 2.3 years  |
| Support     | 12.8%          | 320            | 2.8 years  |
| Marketing   | 10.5%          | 180            | 3.2 years  |
| Operations  | 9.2%           | 280            | 3.8 years  |
| Engineering | 6.3%           | 520            | 4.5 years  |
```

**In your Dash app, this will display as:**
- ✅ **2-line summary** at the top (easy to read)
- ✅ **Formatted table** below (with proper styling)
- ✅ **Agent badge** showing which agent answered
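
To illustrate how a frontend might separate the two parts of this format, here is a minimal sketch of a parser. It is not the Dash app's actual code: the function name `split_summary_and_table` is hypothetical, and it assumes the response consists of plain summary lines followed by a single pipe-delimited markdown table.

```python
def split_summary_and_table(text: str):
    """Split agent output into summary lines and table rows (list of dicts)."""
    summary, table_lines = [], []
    for line in text.strip().splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        # Lines starting with '|' are treated as table rows, the rest as summary
        if stripped.startswith("|"):
            table_lines.append(stripped)
        else:
            summary.append(stripped)

    def cells(row: str):
        return [c.strip() for c in row.strip("|").split("|")]

    # A usable table needs at least a header row and a separator row
    if len(table_lines) < 2:
        return summary, []
    header = cells(table_lines[0])
    # table_lines[1] is the |---|---| separator row and is skipped
    rows = [dict(zip(header, cells(r))) for r in table_lines[2:]]
    return summary, rows


sample = """
Sales department has the highest attrition rate at 15.2%.
Engineering maintains the strongest retention at 6.3%.

| Department  | Attrition Rate |
|-------------|----------------|
| Sales       | 15.2%          |
| Engineering | 6.3%           |
"""

summary, rows = split_summary_and_table(sample)
print(summary)
print(rows)
```

The summary lines can then be rendered as text and the rows passed to a table component. Any prose that follows the table would also end up in `summary`, so a stricter parser may be needed if the format varies.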

## Key Features

✅ **Concise** - Exactly 2 lines of summary, no fluff  
✅ **Specific** - Uses actual numbers from the data  
✅ **Complete** - Full table preserved for detailed analysis  
✅ **Frontend Ready** - Dash app already parses and displays this format
