# Day 4: Multi-Agent Communication & Handoffs in LangGraph

## 🎯 Learning Objectives
By the end of this session, you will:
- Master multi-agent communication patterns in LangGraph
- Implement agent handoffs using Command objects
- Build supervisor patterns with OpenAI for coordinating multiple agents
- Understand network communication and secure data injection
- Implement robust message validation between agents
- Design scalable multi-agent architectures

## ⏱️ Session Structure (2 hours)
- **Learning Materials** (30 min): Theory and communication patterns
- **Hands-on Code** (60 min): Implementation and examples  
- **Practical Exercises** (30 min): Build multi-agent systems

---

## 📖 Learning Materials (30 minutes)

### 📺 Video and Reading Resources
- [LangGraph Multi-Agent Guide](https://langchain-ai.github.io/langgraph/concepts/multi_agent/) - Official documentation
- [DeepLearning.AI - AI Agents in LangGraph](https://www.deeplearning.ai/short-courses/ai-agents-in-langgraph/) - Module 4: Multi-Agent Systems
- [LangChain Academy - Agent Communication](https://academy.langchain.com/) - Handoff patterns deep dive

### 🧠 Theory: Multi-Agent Communication

#### What are Multi-Agent Systems?
Multi-agent systems in LangGraph involve multiple specialized agents that collaborate to solve complex problems. Each agent has specific responsibilities and can communicate with others through structured handoffs.

**Key Concepts:**
- **Agent Handoffs**: Transferring control from one agent to another
- **Command Objects**: Structured way to pass control and data
- **Supervisor Patterns**: Central coordinator managing multiple worker agents
- **Network Communication**: Agents communicating across different processes/systems
- **Message Validation**: Ensuring data integrity and security

#### Communication Patterns

1. **Sequential Handoffs**: Agent A → Agent B → Agent C
2. **Supervisor-Worker**: Central supervisor delegates to specialized workers
3. **Peer-to-Peer**: Agents communicate directly with each other
4. **Hierarchical**: Multi-level agent structures
5. **Event-Driven**: Agents respond to events and messages

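#### Command Objects
A node can return a `Command` to update state and choose the next node in a single step. Its main fields are:
- **goto**: The node (or nodes) to run next; route to `END` to complete the workflow
- **update**: State changes to apply before control moves on
- **resume** / **graph**: Used to resume after an interrupt and to route into a parent graph from a subgraph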
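
To make the `goto`/`update` semantics concrete, the routing loop can be sketched in plain Python. This is a toy stand-in for intuition only: `ToyCommand`, `run`, and the node functions are hypothetical names for this sketch, not LangGraph's implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

END = "__end__"  # sentinel for this sketch: routing here stops the loop


@dataclass
class ToyCommand:
    """Hypothetical stand-in for a routing command: where to go next and what to merge."""
    goto: str
    update: Dict[str, Any] = field(default_factory=dict)


def researcher(state: Dict[str, Any]) -> ToyCommand:
    # Hand off to the writer and record the research notes.
    return ToyCommand(goto="writer", update={"notes": f"notes on {state['task']}"})


def writer(state: Dict[str, Any]) -> ToyCommand:
    # Finish the workflow by routing to END.
    return ToyCommand(goto=END, update={"report": f"Report: {state['notes']}"})


NODES: Dict[str, Callable[[Dict[str, Any]], ToyCommand]] = {
    "researcher": researcher,
    "writer": writer,
}


def run(start: str, state: Dict[str, Any], max_steps: int = 10) -> Dict[str, Any]:
    """Run nodes until one routes to END, merging each update into the state."""
    current = start
    for _ in range(max_steps):
        if current == END:
            return state
        command = NODES[current](state)
        state = {**state, **command.update}
        current = command.goto
    raise RuntimeError("step limit reached without routing to END")


final_state = run("researcher", {"task": "AI in testing"})
print(final_state["report"])
```

Each node decides both what changes and who runs next, which is why no explicit edges between agents are needed in this style.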

#### Security Considerations
- Input validation and sanitization
- Authentication between agents
- Secure data transmission
- Access control and permissions
- Audit logging for agent interactions
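
Two of these points, access control and audit logging, can be illustrated with a small standard-library sketch. The permission table, `authorize_handoff`, and `audit_log` are hypothetical names chosen for this example; a real system would likely back them with persistent storage.

```python
from datetime import datetime, timezone
from typing import Dict, List, Set

# Hypothetical permission table: which receivers each sender may hand off to.
ALLOWED_HANDOFFS: Dict[str, Set[str]] = {
    "coordinator": {"researcher"},
    "researcher": {"analyst"},
    "analyst": {"writer"},
}

# In-memory audit trail; every decision is recorded, allowed or not.
audit_log: List[Dict[str, str]] = []


def authorize_handoff(sender: str, receiver: str) -> bool:
    """Check the permission table and record the decision in the audit log."""
    allowed = receiver in ALLOWED_HANDOFFS.get(sender, set())
    audit_log.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "sender": sender,
        "receiver": receiver,
        "decision": "allowed" if allowed else "denied",
    })
    return allowed


print(authorize_handoff("coordinator", "researcher"))  # permitted by the table
print(authorize_handoff("researcher", "writer"))       # not in the table, so denied
print(len(audit_log))
```

Logging denied attempts as well as allowed ones is a deliberate choice here: rejected handoffs are often the most useful entries when investigating a misbehaving agent.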

---
## 💻 Hands-on Code (60 minutes)

### Setup and Imports

In [None]:
# Install required packages
!pip install langgraph langchain langchain-openai pydantic python-dotenv
!pip install langgraph-checkpoint-sqlite httpx aiohttp
!pip install cryptography  # For secure communication

In [None]:
import os
import json
import asyncio
from typing import TypedDict, Literal, List, Optional, Dict, Any, Union
from pydantic import BaseModel, Field, validator
from enum import Enum
from datetime import datetime
from dotenv import load_dotenv

# LangGraph imports
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.types import Command
from langgraph.constants import INTERRUPT

# LangChain imports
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage, BaseMessage, SystemMessage
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool

# Load environment variables
load_dotenv()

# Configure OpenAI
openai_api_key = os.getenv("OPENAI_API_KEY")
if not openai_api_key:
    print("⚠️ Please set OPENAI_API_KEY in your .env file")
    print("Example: OPENAI_API_KEY=sk-...")
else:
    print("✅ OpenAI API key loaded successfully")

# Initialize models
llm = ChatOpenAI(model="gpt-4", temperature=0, openai_api_key=openai_api_key)
fast_llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0, openai_api_key=openai_api_key)

### 1. Basic Command Objects and Handoffs

In [None]:
class AgentRole(str, Enum):
    """Define different agent roles"""
    COORDINATOR = "coordinator"
    RESEARCHER = "researcher"
    ANALYST = "analyst"
    WRITER = "writer"
    REVIEWER = "reviewer"

class MessageType(str, Enum):
    """Types of messages between agents"""
    TASK_REQUEST = "task_request"
    TASK_RESPONSE = "task_response"
    HANDOFF = "handoff"
    ERROR = "error"
    STATUS_UPDATE = "status_update"

class AgentMessage(BaseModel):
    """Structured message between agents"""
    from_agent: AgentRole
    to_agent: AgentRole
    message_type: MessageType
    content: str
    metadata: Dict[str, Any] = Field(default_factory=dict)
    timestamp: str = Field(default_factory=lambda: datetime.now().isoformat())
    message_id: str = Field(default_factory=lambda: f"msg_{datetime.now().timestamp()}")

class MultiAgentState(BaseModel):
    """State for multi-agent communication"""
    messages: List[BaseMessage] = Field(default_factory=list)
    agent_messages: List[AgentMessage] = Field(default_factory=list)
    current_agent: AgentRole = AgentRole.COORDINATOR
    task_description: str = ""
    research_data: Dict[str, Any] = Field(default_factory=dict)
    analysis_results: Dict[str, Any] = Field(default_factory=dict)
    final_output: str = ""
    workflow_status: str = "pending"
    error_log: List[str] = Field(default_factory=list)

# Basic handoff functions
def coordinator_node(state: MultiAgentState) -> Command[Literal["researcher"]]:
    """Coordinator agent that manages the workflow"""
    print(f"🎯 Coordinator: Processing task - {state.task_description}")
    
    # Create handoff message
    handoff_msg = AgentMessage(
        from_agent=AgentRole.COORDINATOR,
        to_agent=AgentRole.RESEARCHER,
        message_type=MessageType.HANDOFF,
        content=f"Please research: {state.task_description}",
        metadata={"priority": "high", "deadline": "1 hour"}
    )
    
    state.agent_messages.append(handoff_msg)
    state.current_agent = AgentRole.RESEARCHER
    
    # Use Command to transfer control
    return Command(goto="researcher", update=state.model_dump())

def researcher_node(state: MultiAgentState) -> Command[Literal["analyst"]]:
    """Research agent that gathers information"""
    print(f"🔍 Researcher: Gathering information about - {state.task_description}")
    
    # Simulate research using LLM
    research_prompt = f"""
    You are a research agent. Gather key information about: {state.task_description}
    Provide a structured research summary with key points and sources.
    """
    
    response = fast_llm.invoke([HumanMessage(content=research_prompt)])
    
    # Store research data
    state.research_data = {
        "summary": response.content,
        "completed_at": datetime.now().isoformat(),
        "agent": AgentRole.RESEARCHER
    }
    
    # Create handoff to analyst
    handoff_msg = AgentMessage(
        from_agent=AgentRole.RESEARCHER,
        to_agent=AgentRole.ANALYST,
        message_type=MessageType.HANDOFF,
        content="Research completed. Please analyze the findings.",
        metadata={"research_summary": response.content[:200]}
    )
    
    state.agent_messages.append(handoff_msg)
    state.current_agent = AgentRole.ANALYST
    
    return Command(goto="analyst", update=state.model_dump())

def analyst_node(state: MultiAgentState) -> Command[Literal["writer"]]:
    """Analyst agent that processes research data"""
    print(f"📊 Analyst: Analyzing research data")
    
    # Analyze the research data
    analysis_prompt = f"""
    You are an analyst agent. Analyze this research data and provide insights:
    
    Research Summary:
    {state.research_data.get('summary', 'No research data available')}
    
    Provide key insights, trends, and recommendations.
    """
    
    response = llm.invoke([HumanMessage(content=analysis_prompt)])
    
    # Store analysis results
    state.analysis_results = {
        "insights": response.content,
        "completed_at": datetime.now().isoformat(),
        "agent": AgentRole.ANALYST
    }
    
    # Create handoff to writer
    handoff_msg = AgentMessage(
        from_agent=AgentRole.ANALYST,
        to_agent=AgentRole.WRITER,
        message_type=MessageType.HANDOFF,
        content="Analysis completed. Please create final report.",
        metadata={"analysis_summary": response.content[:200]}
    )
    
    state.agent_messages.append(handoff_msg)
    state.current_agent = AgentRole.WRITER
    
    return Command(goto="writer", update=state.model_dump())

def writer_node(state: MultiAgentState) -> Command[Literal[END]]:
    """Writer agent that creates final output"""
    print(f"✍️ Writer: Creating final report")
    
    # Create final report
    writing_prompt = f"""
    You are a writer agent. Create a comprehensive report based on:
    
    Task: {state.task_description}
    
    Research Data:
    {state.research_data.get('summary', 'No research available')}
    
    Analysis:
    {state.analysis_results.get('insights', 'No analysis available')}
    
    Create a well-structured, professional report.
    """
    
    response = llm.invoke([HumanMessage(content=writing_prompt)])
    
    # Store final output
    state.final_output = response.content
    state.workflow_status = "completed"
    
    # Final status message
    status_msg = AgentMessage(
        from_agent=AgentRole.WRITER,
        to_agent=AgentRole.COORDINATOR,
        message_type=MessageType.STATUS_UPDATE,
        content="Final report completed successfully.",
        metadata={"output_length": len(response.content)}
    )
    
    state.agent_messages.append(status_msg)
    
    # Command has no `finish` argument; route to END to complete the workflow
    return Command(goto=END, update=state.model_dump())

print("🎭 Multi-agent nodes defined successfully")

### 2. Building the Multi-Agent Graph

In [None]:
def create_multi_agent_graph():
    """Create a multi-agent graph with handoffs"""
    
    # Create the graph
    graph = StateGraph(MultiAgentState)
    
    # Add agent nodes
    graph.add_node("coordinator", coordinator_node)
    graph.add_node("researcher", researcher_node)
    graph.add_node("analyst", analyst_node)
    graph.add_node("writer", writer_node)
    
    # Add edges (Command objects handle the actual routing)
    graph.add_edge(START, "coordinator")
    
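    # Compile with persistence. In recent langgraph-checkpoint-sqlite releases
    # SqliteSaver.from_conn_string() returns a context manager rather than a saver,
    # so build the saver from an explicit connection instead.
    import sqlite3
    conn = sqlite3.connect("multi_agent.db", check_same_thread=False)
    saver = SqliteSaver(conn)
    app = graph.compile(checkpointer=saver)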
    
    return app

# Create and test the multi-agent system
multi_agent_app = create_multi_agent_graph()
print("🏗️ Multi-agent graph created successfully")

# Test with a sample task
initial_state = MultiAgentState(
    task_description="Analyze the impact of AI on software development workflows",
    messages=[HumanMessage(content="Please analyze the impact of AI on software development workflows")]
)

config = {"configurable": {"thread_id": "multi-agent-test"}}

print("\n🚀 Running multi-agent workflow...")
result = multi_agent_app.invoke(initial_state, config=config)

# invoke() returns the final state as a dict, even when the schema is a Pydantic model
print(f"\n📊 Workflow Status: {result['workflow_status']}")
print(f"💬 Agent Messages: {len(result['agent_messages'])}")
print(f"📝 Final Output Length: {len(result['final_output'])} characters")
print(f"\n🎯 Final Report Preview: {result['final_output'][:300]}...")

### 3. Supervisor Pattern with OpenAI

In [None]:
class SupervisorState(BaseModel):
    """State for supervisor pattern"""
    messages: List[BaseMessage] = Field(default_factory=list)
    task_queue: List[Dict[str, Any]] = Field(default_factory=list)
    worker_status: Dict[str, str] = Field(default_factory=dict)
    completed_tasks: List[Dict[str, Any]] = Field(default_factory=list)
    current_task: Optional[Dict[str, Any]] = None
    supervisor_decisions: List[str] = Field(default_factory=list)

# Define specialized worker agents
def web_researcher_worker(state: SupervisorState) -> Command[Literal["supervisor"]]:
    """Specialized worker for web research tasks"""
    print("🌐 Web Researcher: Starting web research task")
    
    if state.current_task:
        task = state.current_task
        
        # Simulate web research
        research_prompt = f"""
        You are a web research specialist. Research the following topic:
        {task.get('description', '')}
        
        Focus on finding recent developments, key statistics, and authoritative sources.
        Provide a structured summary with source citations.
        """
        
        response = fast_llm.invoke([HumanMessage(content=research_prompt)])
        
        # Complete the task
        completed_task = {
            **task,
            "result": response.content,
            "completed_by": "web_researcher",
            "completed_at": datetime.now().isoformat()
        }
        
        state.completed_tasks.append(completed_task)
        state.worker_status["web_researcher"] = "completed"
        state.current_task = None
    
    return Command(goto="supervisor", update=state.model_dump())

def data_analyst_worker(state: SupervisorState) -> Command[Literal["supervisor"]]:
    """Specialized worker for data analysis tasks"""
    print("📈 Data Analyst: Starting data analysis task")
    
    if state.current_task:
        task = state.current_task
        
        # Simulate data analysis
        analysis_prompt = f"""
        You are a data analysis specialist. Analyze the following:
        {task.get('description', '')}
        
        Focus on identifying patterns, trends, and statistical insights.
        Provide quantitative analysis with supporting evidence.
        """
        
        response = llm.invoke([HumanMessage(content=analysis_prompt)])
        
        # Complete the task
        completed_task = {
            **task,
            "result": response.content,
            "completed_by": "data_analyst",
            "completed_at": datetime.now().isoformat()
        }
        
        state.completed_tasks.append(completed_task)
        state.worker_status["data_analyst"] = "completed"
        state.current_task = None
    
    return Command(goto="supervisor", update=state.model_dump())

def content_writer_worker(state: SupervisorState) -> Command[Literal["supervisor"]]:
    """Specialized worker for content writing tasks"""
    print("✍️ Content Writer: Starting content creation task")
    
    if state.current_task:
        task = state.current_task
        
        # Simulate content writing
        writing_prompt = f"""
        You are a professional content writer. Create content for:
        {task.get('description', '')}
        
        Focus on clarity, engagement, and professional tone.
        Structure the content with appropriate headings and sections.
        """
        
        response = llm.invoke([HumanMessage(content=writing_prompt)])
        
        # Complete the task
        completed_task = {
            **task,
            "result": response.content,
            "completed_by": "content_writer",
            "completed_at": datetime.now().isoformat()
        }
        
        state.completed_tasks.append(completed_task)
        state.worker_status["content_writer"] = "completed"
        state.current_task = None
    
    return Command(goto="supervisor", update=state.model_dump())

def supervisor_node(state: SupervisorState) -> Command[Literal["web_researcher", "data_analyst", "content_writer", "supervisor", END]]:
    """Supervisor that delegates tasks to appropriate workers"""
    print("👔 Supervisor: Managing task delegation")
    
    # If no current task and tasks in queue, assign next task
    if not state.current_task and state.task_queue:
        next_task = state.task_queue.pop(0)
        state.current_task = next_task
        
        # Use OpenAI to determine best worker for the task
        delegation_prompt = f"""
        You are a supervisor managing specialized workers. Determine which worker is best suited for this task:
        
        Task: {next_task.get('description', '')}
        Task Type: {next_task.get('type', 'general')}
        
        Available workers:
        1. web_researcher - Specializes in web research and information gathering
        2. data_analyst - Specializes in data analysis and statistical insights
        3. content_writer - Specializes in content creation and writing
        
        Respond with only the worker name (web_researcher, data_analyst, or content_writer).
        """
        
        response = fast_llm.invoke([HumanMessage(content=delegation_prompt)])
        selected_worker = response.content.strip().lower()
        
        # Validate worker selection
        valid_workers = ["web_researcher", "data_analyst", "content_writer"]
        if selected_worker not in valid_workers:
            selected_worker = "web_researcher"  # Default fallback
        
        state.supervisor_decisions.append(f"Assigned task '{next_task['description'][:50]}...' to {selected_worker}")
        state.worker_status[selected_worker] = "working"
        
        print(f"📋 Supervisor: Delegating to {selected_worker}")
        return Command(goto=selected_worker, update=state.model_dump())
    
    # If all tasks completed, finish
    elif not state.task_queue and not state.current_task:
        print("✅ Supervisor: All tasks completed")
        # Command has no `finish` argument; route to END to complete the workflow
        return Command(goto=END, update=state.model_dump())
    
    # Continue supervising
    else:
        print("⏳ Supervisor: Waiting for task completion")
        return Command(goto="supervisor", update=state.model_dump())

print("👔 Supervisor pattern nodes defined successfully")
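
The supervisor above accepts the model's reply only if it matches a worker name exactly after `strip().lower()`. Replies such as `"Data_Analyst".` or a full sentence would fall through to the default. A more tolerant parser could look like the following sketch; `pick_worker` is a hypothetical helper, not part of LangGraph or LangChain.

```python
import re
from typing import Sequence


def pick_worker(raw_reply: str, valid_workers: Sequence[str], default: str) -> str:
    """Return the first valid worker name mentioned in the reply, else the default."""
    # Lowercase, then turn everything except letters, underscores and spaces
    # into spaces so quotes, punctuation and newlines cannot hide a name.
    normalized = re.sub(r"[^a-z_ ]", " ", raw_reply.lower())
    for token in normalized.split():
        if token in valid_workers:
            return token
    return default


workers = ["web_researcher", "data_analyst", "content_writer"]
print(pick_worker('"Data_Analyst".', workers, "web_researcher"))
print(pick_worker("I would choose content_writer for this.", workers, "web_researcher"))
print(pick_worker("no idea", workers, "web_researcher"))
```

Keeping an explicit default preserves the fallback behaviour of the original node while reducing how often it is triggered by formatting noise.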

### 4. Building and Testing the Supervisor System

In [None]:
def create_supervisor_graph():
    """Create supervisor-worker graph"""
    
    graph = StateGraph(SupervisorState)
    
    # Add supervisor and worker nodes
    graph.add_node("supervisor", supervisor_node)
    graph.add_node("web_researcher", web_researcher_worker)
    graph.add_node("data_analyst", data_analyst_worker)
    graph.add_node("content_writer", content_writer_worker)
    
    # Start with supervisor
    graph.add_edge(START, "supervisor")
    
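    # Compile with persistence (from_conn_string() is a context manager in recent
    # releases, so build the saver from an explicit connection)
    import sqlite3
    conn = sqlite3.connect("supervisor.db", check_same_thread=False)
    saver = SqliteSaver(conn)
    app = graph.compile(checkpointer=saver)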
    
    return app

# Create supervisor system
supervisor_app = create_supervisor_graph()
print("🏭 Supervisor system created successfully")

# Create test tasks
test_tasks = [
    {
        "id": "task_1",
        "description": "Research the latest trends in machine learning for 2025",
        "type": "research",
        "priority": "high"
    },
    {
        "id": "task_2",
        "description": "Analyze user engagement data from the past quarter",
        "type": "analysis",
        "priority": "medium"
    },
    {
        "id": "task_3",
        "description": "Write a blog post about AI safety best practices",
        "type": "content",
        "priority": "medium"
    }
]

# Test supervisor system
supervisor_state = SupervisorState(
    task_queue=test_tasks,
    worker_status={worker: "idle" for worker in ["web_researcher", "data_analyst", "content_writer"]}
)

supervisor_config = {"configurable": {"thread_id": "supervisor-test"}}

print("\n🚀 Running supervisor workflow...")
supervisor_result = supervisor_app.invoke(supervisor_state, config=supervisor_config)

print(f"\n📊 Supervisor Results:")
# invoke() returns the final state as a dict, so index by key
print(f"✅ Completed Tasks: {len(supervisor_result['completed_tasks'])}")
print(f"📋 Supervisor Decisions: {len(supervisor_result['supervisor_decisions'])}")
print(f"👥 Worker Status: {supervisor_result['worker_status']}")

# Show completed tasks
for i, task in enumerate(supervisor_result['completed_tasks']):
    print(f"\n📝 Task {i+1}: {task['description'][:50]}...")
    print(f"   Completed by: {task['completed_by']}")
    print(f"   Result preview: {task['result'][:100]}...")

### 5. Network Communication and Secure Data Injection

In [None]:
import hashlib
import hmac
import base64
from cryptography.fernet import Fernet
from pydantic import field_validator  # Pydantic v2 replacement for the deprecated @validator

class SecureMessage(BaseModel):
    """Secure message with validation and encryption"""
    sender_id: str
    receiver_id: str
    message_type: str
    payload: str  # Encrypted content
    signature: str
    timestamp: str
    nonce: str = Field(default_factory=lambda: os.urandom(16).hex())
    
    @field_validator('timestamp')
    @classmethod
    def validate_timestamp(cls, v: str) -> str:
        """Reject timestamps that are not valid ISO 8601 strings"""
        try:
            datetime.fromisoformat(v)
        except ValueError as exc:
            raise ValueError("Invalid timestamp format") from exc
        return v

class SecureAgentCommunication:
    """Handles secure communication between agents"""
    
    def __init__(self, secret_key: str):
        self.secret_key = secret_key.encode()
        # Generate encryption key from secret
        key = base64.urlsafe_b64encode(hashlib.sha256(self.secret_key).digest())
        self.cipher = Fernet(key)
        
    def create_secure_message(self, sender_id: str, receiver_id: str, 
                            message_type: str, content: str) -> SecureMessage:
        """Create a secure, signed message"""
        
        # Encrypt the content
        encrypted_content = self.cipher.encrypt(content.encode())
        payload = base64.b64encode(encrypted_content).decode()
        
        # Create timestamp
        timestamp = datetime.now().isoformat()
        
        # Create message data for signing
        message_data = f"{sender_id}:{receiver_id}:{message_type}:{payload}:{timestamp}"
        
        # Create HMAC signature
        signature = hmac.new(
            self.secret_key,
            message_data.encode(),
            hashlib.sha256
        ).hexdigest()
        
        return SecureMessage(
            sender_id=sender_id,
            receiver_id=receiver_id,
            message_type=message_type,
            payload=payload,
            signature=signature,
            timestamp=timestamp
        )
    
    def verify_and_decrypt_message(self, message: SecureMessage) -> tuple[bool, str]:
        """Verify message signature and decrypt content"""
        
        # Recreate message data for verification
        message_data = f"{message.sender_id}:{message.receiver_id}:{message.message_type}:{message.payload}:{message.timestamp}"
        
        # Verify signature
        expected_signature = hmac.new(
            self.secret_key,
            message_data.encode(),
            hashlib.sha256
        ).hexdigest()
        
        if not hmac.compare_digest(expected_signature, message.signature):
            return False, "Invalid signature"
        
        # Check timestamp: this limits the replay window to 1 hour, but does not
        # block a replay inside that window
        msg_time = datetime.fromisoformat(message.timestamp)
        current_time = datetime.now()
        if (current_time - msg_time).total_seconds() > 3600:  # 1 hour expiry
            return False, "Message expired"
        
        # Decrypt content
        try:
            encrypted_content = base64.b64decode(message.payload)
            decrypted_content = self.cipher.decrypt(encrypted_content).decode()
            return True, decrypted_content
        except Exception as e:
            return False, f"Decryption failed: {str(e)}"

class NetworkAgentState(BaseModel):
    """State for network-based agent communication"""
    messages: List[BaseMessage] = Field(default_factory=list)
    secure_messages: List[SecureMessage] = Field(default_factory=list)
    agent_id: str
    trusted_agents: List[str] = Field(default_factory=list)
    message_log: List[Dict[str, Any]] = Field(default_factory=list)
    security_violations: List[str] = Field(default_factory=list)

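# Create secure communication instance.
# Demo only: never hardcode a shared secret in real deployments. AGENT_SHARED_SECRET
# is a hypothetical environment variable name used for this example.
secure_comm = SecureAgentCommunication(os.getenv("AGENT_SHARED_SECRET", "my_secret_key_12345"))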

def secure_sender_agent(state: NetworkAgentState) -> Command[Literal["receiver"]]:
    """Agent that sends secure messages"""
    print(f"📡 Secure Sender ({state.agent_id}): Preparing secure message")
    
    # Create secure message
    content = "This is a confidential message containing sensitive agent data."
    secure_msg = secure_comm.create_secure_message(
        sender_id=state.agent_id,
        receiver_id="receiver_agent",
        message_type="task_data",
        content=content
    )
    
    state.secure_messages.append(secure_msg)
    state.message_log.append({
        "action": "message_sent",
        "to": "receiver_agent",
        "timestamp": datetime.now().isoformat(),
        "message_id": secure_msg.nonce
    })
    
    print(f"✅ Secure message sent with signature: {secure_msg.signature[:20]}...")
    
    return Command(goto="receiver", update=state.model_dump())

def secure_receiver_agent(state: NetworkAgentState) -> Command[Literal[END]]:
    """Agent that receives and validates secure messages"""
    print(f"📨 Secure Receiver: Processing incoming messages")
    
    for secure_msg in state.secure_messages:
        if secure_msg.receiver_id == "receiver_agent":
            # Verify and decrypt message
            is_valid, content = secure_comm.verify_and_decrypt_message(secure_msg)
            
            if is_valid:
                print(f"✅ Message verified and decrypted successfully")
                print(f"📄 Content: {content[:50]}...")
                
                # Log successful message processing
                state.message_log.append({
                    "action": "message_received",
                    "from": secure_msg.sender_id,
                    "timestamp": datetime.now().isoformat(),
                    "status": "verified",
                    "content_preview": content[:50]
                })
                
                # Process the secure content with LLM
                processing_prompt = f"""
                You received a secure message from another agent:
                {content}
                
                Generate an appropriate response acknowledging the message.
                """
                
                response = fast_llm.invoke([HumanMessage(content=processing_prompt)])
                state.messages.append(response)
                
            else:
                print(f"❌ Message verification failed: {content}")
                state.security_violations.append(f"Invalid message from {secure_msg.sender_id}: {content}")
    
    # Command has no `finish` argument; route to END to complete the workflow
    return Command(goto=END, update=state.model_dump())

print("🔐 Secure communication agents defined successfully")
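
The one-hour expiry check in `verify_and_decrypt_message` narrows the replay window but does not stop a message from being replayed inside it. One common complement is to remember recently seen nonces and reject repeats. The sketch below assumes the nonce is included in the signed data, which the code above does not yet do; `NonceCache` is a hypothetical helper written for this example.

```python
import time
from typing import Dict, Optional


class NonceCache:
    """Remembers nonces for `ttl_seconds` and rejects any nonce seen twice."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl_seconds = ttl_seconds
        self._seen: Dict[str, float] = {}  # nonce -> time it was first seen

    def check_and_store(self, nonce: str, now: Optional[float] = None) -> bool:
        """Return True if the nonce is new, False if it is a replay."""
        now = time.time() if now is None else now
        # Drop entries older than the TTL so the cache stays bounded.
        self._seen = {n: t for n, t in self._seen.items() if now - t <= self.ttl_seconds}
        if nonce in self._seen:
            return False
        self._seen[nonce] = now
        return True


cache = NonceCache(ttl_seconds=3600)
print(cache.check_and_store("abc123", now=0.0))     # first delivery is accepted
print(cache.check_and_store("abc123", now=10.0))    # replay inside the window is rejected
print(cache.check_and_store("abc123", now=4000.0))  # cache entry has expired by now
```

The last call is accepted by the cache alone, which is why the TTL should match the message expiry: once the cache forgets a nonce, the timestamp check is what rejects the stale message.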

### 6. Testing Secure Communication

In [None]:
def create_secure_communication_graph():
    """Create graph for secure agent communication"""
    
    graph = StateGraph(NetworkAgentState)
    
    graph.add_node("sender", secure_sender_agent)
    graph.add_node("receiver", secure_receiver_agent)
    
    graph.add_edge(START, "sender")
    
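    # Compile with persistence (from_conn_string() is a context manager in recent
    # releases, so build the saver from an explicit connection)
    import sqlite3
    conn = sqlite3.connect("secure_comm.db", check_same_thread=False)
    saver = SqliteSaver(conn)
    app = graph.compile(checkpointer=saver)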
    
    return app

# Test secure communication
secure_app = create_secure_communication_graph()
print("🔐 Secure communication graph created")

# Create test state
secure_state = NetworkAgentState(
    agent_id="sender_agent",
    trusted_agents=["receiver_agent"]
)

secure_config = {"configurable": {"thread_id": "secure-comm-test"}}

print("\n🚀 Testing secure communication...")
secure_result = secure_app.invoke(secure_state, config=secure_config)

print(f"\n📊 Secure Communication Results:")
# invoke() returns the final state as a dict, so index by key
print(f"📧 Secure Messages: {len(secure_result['secure_messages'])}")
print(f"📝 Message Log Entries: {len(secure_result['message_log'])}")
print(f"⚠️ Security Violations: {len(secure_result['security_violations'])}")
print(f"💬 Generated Responses: {len(secure_result['messages'])}")

# Show message log
for log_entry in secure_result['message_log']:
    print(f"\n📋 {log_entry['action']}: {log_entry.get('status', 'N/A')}")
    if 'content_preview' in log_entry:
        print(f"   Preview: {log_entry['content_preview']}...")

### 7. Message Validation and Error Handling

In [None]:
class MessageValidator:
    """Validates messages between agents"""
    
    @staticmethod
    def validate_agent_message(message: AgentMessage) -> tuple[bool, List[str]]:
        """Validate agent message structure and content"""
        errors = []
        
        # Check required fields
        if not message.from_agent:
            errors.append("Missing from_agent")
        if not message.to_agent:
            errors.append("Missing to_agent")
        if not message.content:
            errors.append("Missing content")
        
        # Validate content length
        if len(message.content) > 10000:
            errors.append("Content too long (max 10000 characters)")
        
        # Check for malicious content patterns
        malicious_patterns = [
            "<script", "javascript:", "eval(", "exec(",
            "system(", "os.system", "subprocess", "__import__"
        ]
        
        content_lower = message.content.lower()
        for pattern in malicious_patterns:
            if pattern in content_lower:
                errors.append(f"Potentially malicious content detected: {pattern}")
        
        # Validate timestamp
        try:
            msg_time = datetime.fromisoformat(message.timestamp)
            current_time = datetime.now()
            if msg_time > current_time:
                errors.append("Message timestamp is in the future")
        except ValueError:
            errors.append("Invalid timestamp format")
        
        return len(errors) == 0, errors
    
    @staticmethod
    def sanitize_content(content: str) -> str:
        """Sanitize message content"""
        # Remove potentially dangerous characters
        import re
        # Remove HTML tags
        content = re.sub(r'<[^>]+>', '', content)
        # Remove script-like patterns
        content = re.sub(r'(javascript:|data:text/html)', '', content, flags=re.IGNORECASE)
        # Limit length
        content = content[:5000]
        return content.strip()

class ValidatedAgentState(BaseModel):
    """State with message validation"""
    messages: List[BaseMessage] = Field(default_factory=list)
    agent_messages: List[AgentMessage] = Field(default_factory=list)
    validation_log: List[Dict[str, Any]] = Field(default_factory=list)
    rejected_messages: List[Dict[str, Any]] = Field(default_factory=list)
    current_agent: str = "validator"
    task_status: str = "pending"

def message_validator_node(state: ValidatedAgentState) -> Command[Literal["processor"]]:
    """Validates incoming messages"""
    print("🔍 Message Validator: Checking message integrity")
    
    validator = MessageValidator()
    
    # Create test messages to validate
    test_messages = [
        AgentMessage(
            from_agent=AgentRole.COORDINATOR,
            to_agent=AgentRole.RESEARCHER,
            message_type=MessageType.TASK_REQUEST,
            content="Please research AI safety protocols"
        ),
        AgentMessage(
            from_agent=AgentRole.RESEARCHER,
            to_agent=AgentRole.ANALYST,
            message_type=MessageType.TASK_RESPONSE,
            content="<script>alert('malicious')</script>Research completed"
        )
    ]
    
    for msg in test_messages:
        is_valid, errors = validator.validate_agent_message(msg)
        
        validation_entry = {
            "message_id": msg.message_id,
            "from_agent": msg.from_agent,
            "to_agent": msg.to_agent,
            "is_valid": is_valid,
            "errors": errors,
            "timestamp": datetime.now().isoformat()
        }
        
        state.validation_log.append(validation_entry)
        
        if is_valid:
            # Sanitize and add to valid messages
            msg.content = validator.sanitize_content(msg.content)
            state.agent_messages.append(msg)
            print(f"✅ Message validated: {msg.message_id[:8]}...")
        else:
            # Reject invalid message
            state.rejected_messages.append({
                "message": msg.model_dump(),
                "errors": errors,
                "rejected_at": datetime.now().isoformat()
            })
            print(f"❌ Message rejected: {', '.join(errors)}")
    
    state.task_status = "validation_complete"
    
    return Command(goto="processor", update=state.model_dump())

def secure_processor_node(state: ValidatedAgentState) -> Command[Literal[END]]:
    """Processes validated messages securely"""
    print("🛡️ Secure Processor: Processing validated messages")
    
    valid_messages_count = len(state.agent_messages)
    rejected_messages_count = len(state.rejected_messages)
    
    # Process valid messages with OpenAI
    if state.agent_messages:
        messages_summary = "\n".join([f"- {msg.content[:100]}..." for msg in state.agent_messages])
        
        processing_prompt = f"""
        You are processing validated inter-agent messages. Summary of {valid_messages_count} valid messages:
        
        {messages_summary}
        
        Generate a summary of the communication flow and any insights.
        """
        
        response = llm.invoke([HumanMessage(content=processing_prompt)])
        state.messages.append(response)
    
    state.task_status = "completed"
    
    print(f"📊 Processed {valid_messages_count} valid messages, rejected {rejected_messages_count}")
    
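    # Command has no `finish` parameter; end the run by routing to END,
    # which comes from langgraph.graph like START.
    return Command(goto=END, update=state.model_dump())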

# Create validation graph
def create_validation_graph():
    """Create message validation graph"""
    
    graph = StateGraph(ValidatedAgentState)
    
    graph.add_node("validator", message_validator_node)
    graph.add_node("processor", secure_processor_node)
    
    graph.add_edge(START, "validator")
    
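    # Compile with persistence. Recent langgraph-checkpoint-sqlite releases make
    # from_conn_string() a context manager, so build the saver from a connection.
    import sqlite3
    conn = sqlite3.connect("validation.db", check_same_thread=False)
    saver = SqliteSaver(conn)
    app = graph.compile(checkpointer=saver)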
    
    return app

# Test message validation
validation_app = create_validation_graph()
print("\n🔍 Message validation graph created")

validation_state = ValidatedAgentState()
validation_config = {"configurable": {"thread_id": "validation-test"}}

print("\n🚀 Testing message validation...")
validation_result = validation_app.invoke(validation_state, config=validation_config)

# invoke() returns the final state as a dict, not a ValidatedAgentState instance
print("\n📊 Validation Results:")
print(f"✅ Valid Messages: {len(validation_result['agent_messages'])}")
print(f"❌ Rejected Messages: {len(validation_result['rejected_messages'])}")
print(f"📝 Validation Log Entries: {len(validation_result['validation_log'])}")

# Show validation details
for log_entry in validation_result['validation_log']:
    status = "✅ VALID" if log_entry['is_valid'] else "❌ INVALID"
    print(f"\n{status} - {log_entry['from_agent']} → {log_entry['to_agent']}")
    if log_entry['errors']:
        print(f"   Errors: {', '.join(log_entry['errors'])}")

print("\n🔐 Message validation and security testing completed")

---
## 🛠️ Practical Exercises (30 minutes)

### Exercise 1: Build a Customer Support Multi-Agent System
**Goal**: Create a customer support system with multiple specialized agents.

**Requirements**:
- Initial classifier agent that routes tickets
- Technical support agent for technical issues
- Billing agent for payment issues
- Escalation agent for complex problems
- Use Command objects for handoffs
- Implement proper message validation

In [None]:
# Exercise 1: Your implementation here
class SupportTicket(BaseModel):
    """Support ticket data structure"""
    # TODO: Define your support ticket schema
    pass

class SupportAgentState(BaseModel):
    """State for customer support system"""
    # TODO: Define your support system state
    pass

def ticket_classifier_agent(state: SupportAgentState) -> Command:
    """Classifies and routes support tickets"""
    # TODO: Implement ticket classification logic
    pass

def technical_support_agent(state: SupportAgentState) -> Command:
    """Handles technical support issues"""
    # TODO: Implement technical support logic
    pass

# TODO: Create and test your customer support system
print("🎫 Exercise 1: Implement your customer support multi-agent system here")

### Exercise 2: Implement a Distributed Task Processing System
**Goal**: Build a system where multiple worker agents process tasks in parallel.

**Requirements**:
- Task dispatcher that distributes work
- Multiple worker agents that can run in parallel
- Results aggregator that combines outputs
- Implement secure communication between agents
- Handle worker failures and task redistribution

In [None]:
# Exercise 2: Your implementation here
class DistributedTask(BaseModel):
    """Task for distributed processing"""
    # TODO: Define your task structure
    pass

class DistributedSystemState(BaseModel):
    """State for distributed task processing"""
    # TODO: Define your distributed system state
    pass

def task_dispatcher_agent(state: DistributedSystemState) -> Command:
    """Dispatches tasks to available workers"""
    # TODO: Implement task dispatch logic
    pass

def parallel_worker_agent(state: DistributedSystemState) -> Command:
    """Processes tasks in parallel"""
    # TODO: Implement parallel processing logic
    pass

# TODO: Create and test your distributed processing system
print("🔄 Exercise 2: Implement your distributed task processing system here")
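
As a starting point for the failure-handling requirement, here is a minimal sketch of parallel dispatch with task redistribution. It uses only the standard library and leaves LangGraph out entirely. The names `run_with_redistribution` and `flaky_worker`, and the `max_attempts` limit, are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_with_redistribution(tasks, worker, max_attempts=3, max_workers=4):
    """Run tasks in parallel and re-queue failed ones until max_attempts is reached.

    tasks maps a task id to its payload; worker is called as
    worker(task_id, payload, attempt). Returns (results, failed).
    """
    results, failed = {}, {}
    pending = {task_id: 1 for task_id in tasks}  # task id -> attempt number
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending:
            futures = {
                pool.submit(worker, task_id, tasks[task_id], attempt): (task_id, attempt)
                for task_id, attempt in pending.items()
            }
            pending = {}
            for future in as_completed(futures):
                task_id, attempt = futures[future]
                try:
                    results[task_id] = future.result()
                except Exception as exc:
                    if attempt < max_attempts:
                        pending[task_id] = attempt + 1  # redistribute in the next round
                    else:
                        failed[task_id] = str(exc)  # give up and record the error
    return results, failed


def flaky_worker(task_id, payload, attempt):
    """Hypothetical worker: 't2' fails once, 't3' always fails."""
    if task_id == "t2" and attempt == 1:
        raise RuntimeError("worker crashed")
    if task_id == "t3":
        raise ValueError("bad payload")
    return payload.upper()


tasks = {"t1": "alpha", "t2": "beta", "t3": "gamma"}
results, failed = run_with_redistribution(tasks, flaky_worker)
print("results:", results)
print("failed:", failed)
```

In a graph-based solution the dispatcher node could keep the same bookkeeping (attempt counts, results, permanent failures) in `DistributedSystemState` and let the aggregator read from it.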

### Challenge: Create an AI Research Collaboration Network
**Goal**: Build an advanced multi-agent system that simulates AI research collaboration.

**Advanced Requirements**:
- Research proposal agent that generates research ideas
- Peer review agents that evaluate proposals
- Collaboration matching agent that pairs researchers
- Progress tracking and milestone management
- Secure sharing of research data
- Dynamic agent roles and permissions
- Cross-network communication protocols

In [None]:
# Challenge: Your implementation here
class ResearchProposal(BaseModel):
    """Research proposal structure"""
    # TODO: Design comprehensive research proposal schema
    pass

class ResearchNetworkState(BaseModel):
    """State for research collaboration network"""
    # TODO: Design complex research network state
    pass

# TODO: Implement your AI research collaboration network
print("🎯 Challenge: Build your AI research collaboration network here")
print("💡 Hint: Consider research lifecycle, peer review, and collaboration dynamics")
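
For the "dynamic agent roles and permissions" requirement, one possible shape is a small registry that maps agents to roles and roles to permissions. This is a sketch only: the role names, the `Permission` members, and the `AccessRegistry` class are hypothetical and not part of LangGraph.

```python
from enum import Enum, auto


class Permission(Enum):
    SUBMIT_PROPOSAL = auto()
    REVIEW_PROPOSAL = auto()
    READ_DATA = auto()
    SHARE_DATA = auto()


# Hypothetical role definitions; adjust them to your own research workflow.
ROLE_PERMISSIONS = {
    "researcher": {Permission.SUBMIT_PROPOSAL, Permission.READ_DATA, Permission.SHARE_DATA},
    "reviewer": {Permission.REVIEW_PROPOSAL, Permission.READ_DATA},
    "observer": {Permission.READ_DATA},
}


class AccessRegistry:
    """Tracks the current role of each agent and answers permission checks."""

    def __init__(self):
        self._roles = {}

    def assign(self, agent_id: str, role: str) -> None:
        if role not in ROLE_PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        self._roles[agent_id] = role  # reassigning changes the role dynamically

    def can(self, agent_id: str, permission: Permission) -> bool:
        role = self._roles.get(agent_id)
        return role is not None and permission in ROLE_PERMISSIONS[role]


registry = AccessRegistry()
registry.assign("agent-1", "observer")
print(registry.can("agent-1", Permission.SHARE_DATA))   # observer cannot share data
registry.assign("agent-1", "researcher")                # role changes at runtime
print(registry.can("agent-1", Permission.SHARE_DATA))   # researcher can
```

A node could consult such a registry before acting on a message and reject the handoff when the sending agent lacks the required permission.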

---
## 📚 Solutions and Best Practices

### Exercise 1 Solution: Customer Support System

In [None]:
# Complete solution for Exercise 1
class TicketPriority(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    URGENT = "urgent"

class TicketCategory(str, Enum):
    TECHNICAL = "technical"
    BILLING = "billing"
    GENERAL = "general"
    ESCALATION = "escalation"

class SupportTicket(BaseModel):
    ticket_id: str
    customer_id: str
    title: str
    description: str
    category: Optional[TicketCategory] = None
    priority: TicketPriority = TicketPriority.MEDIUM
    status: str = "open"
    assigned_agent: Optional[str] = None
    resolution: Optional[str] = None
    created_at: str = Field(default_factory=lambda: datetime.now().isoformat())

class SupportAgentState(BaseModel):
    messages: List[BaseMessage] = Field(default_factory=list)
    tickets: List[SupportTicket] = Field(default_factory=list)
    current_ticket: Optional[SupportTicket] = None
    agent_workload: Dict[str, int] = Field(default_factory=dict)
    escalation_log: List[Dict[str, Any]] = Field(default_factory=list)

def ticket_classifier_agent(state: SupportAgentState) -> Command:
    """Classifies and routes support tickets"""
    print("🎫 Ticket Classifier: Analyzing incoming ticket")
    
    if state.current_ticket:
        ticket = state.current_ticket
        
        # Use OpenAI to classify the ticket
        classification_prompt = f"""
        Classify this support ticket into one of these categories:
        - technical: Technical issues, bugs, software problems
        - billing: Payment issues, subscription problems, refunds
        - general: General inquiries, feature requests, information
        - escalation: Complex issues requiring senior support
        
        Ticket: {ticket.title}
        Description: {ticket.description}
        
        Respond with only the category name.
        """
        
        response = fast_llm.invoke([HumanMessage(content=classification_prompt)])
        category = response.content.strip().lower()
        
        # Validate and assign category
        if category in ["technical", "billing", "general", "escalation"]:
            ticket.category = TicketCategory(category)
        else:
            ticket.category = TicketCategory.GENERAL
        
        print(f"📋 Ticket classified as: {ticket.category}")
        
        # Route to appropriate agent
        if ticket.category == TicketCategory.TECHNICAL:
            return Command(goto="technical_agent", update=state.model_dump())
        elif ticket.category == TicketCategory.BILLING:
            return Command(goto="billing_agent", update=state.model_dump())
        else:
            return Command(goto="general_agent", update=state.model_dump())
    
    # No ticket to classify: end the run (Command has no `finish` parameter)
    return Command(goto=END, update=state.model_dump())

def technical_support_agent(state: SupportAgentState) -> Command:
    """Handles technical support issues"""
    print("🔧 Technical Support: Processing technical issue")
    
    if state.current_ticket:
        ticket = state.current_ticket
        
        # Generate technical solution
        tech_prompt = f"""
        You are a technical support specialist. Provide a solution for:
        
        Issue: {ticket.title}
        Details: {ticket.description}
        
        Provide step-by-step troubleshooting instructions.
        """
        
        response = llm.invoke([HumanMessage(content=tech_prompt)])
        
        ticket.resolution = response.content
        ticket.assigned_agent = "technical_support"
        ticket.status = "resolved"
        
        state.messages.append(response)
        print(f"✅ Technical issue resolved")
    
    # End the run by routing to END (Command has no `finish` parameter)
    return Command(goto=END, update=state.model_dump())

# Create support system
def create_support_system():
    graph = StateGraph(SupportAgentState)
    
    graph.add_node("classifier", ticket_classifier_agent)
    graph.add_node("technical_agent", technical_support_agent)
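    # Add the remaining agents here. The classifier also routes to "billing_agent"
    # and "general_agent", so both must be registered before such tickets arrive.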
    
    graph.add_edge(START, "classifier")
    
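    # Build the saver from a connection; recent langgraph-checkpoint-sqlite
    # releases make from_conn_string() a context manager.
    import sqlite3
    conn = sqlite3.connect("support_system.db", check_same_thread=False)
    saver = SqliteSaver(conn)
    return graph.compile(checkpointer=saver)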

# Test the support system
support_app = create_support_system()

test_ticket = SupportTicket(
    ticket_id="TKT-001",
    customer_id="CUST-123",
    title="Application crashes on startup",
    description="The application crashes immediately when I try to open it. Error message shows 'Memory access violation'."
)

support_state = SupportAgentState(current_ticket=test_ticket)
support_config = {"configurable": {"thread_id": "support-test"}}

print("\n🎫 Testing customer support system...")
support_result = support_app.invoke(support_state, config=support_config)

# invoke() returns the final state as a dict, so rebuild the ticket model from it
ticket_data = support_result.get("current_ticket")
if ticket_data:
    ticket = SupportTicket.model_validate(ticket_data)
    print(f"✅ Support System Solution: Ticket {ticket.ticket_id}")
    print(f"📋 Category: {ticket.category}, Status: {ticket.status}")
    print(f"👤 Assigned to: {ticket.assigned_agent}")
    print(f"🔧 Resolution preview: {ticket.resolution[:200] if ticket.resolution else 'No resolution'}...")
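
The classifier in this solution sends escalation tickets to `general_agent`. One way to give every category its own destination is a lookup table with a safe fallback, sketched below in plain Python. The node name `escalation_agent` is hypothetical, and `parse_category` and `route_for` are illustrative helpers rather than part of the solution above.

```python
from enum import Enum


class TicketCategory(str, Enum):
    TECHNICAL = "technical"
    BILLING = "billing"
    GENERAL = "general"
    ESCALATION = "escalation"


# Node names are illustrative; "escalation_agent" is hypothetical and would
# need to be registered with graph.add_node() like the other agents.
ROUTES = {
    TicketCategory.TECHNICAL: "technical_agent",
    TicketCategory.BILLING: "billing_agent",
    TicketCategory.GENERAL: "general_agent",
    TicketCategory.ESCALATION: "escalation_agent",
}


def parse_category(raw_label: str) -> TicketCategory:
    """Normalize an LLM reply; fall back to GENERAL when it is not a known category."""
    cleaned = raw_label.strip().lower().rstrip(".")
    try:
        return TicketCategory(cleaned)
    except ValueError:
        return TicketCategory.GENERAL


def route_for(raw_label: str) -> str:
    """Return the node name that should handle a ticket with this label."""
    return ROUTES[parse_category(raw_label)]


print(route_for(" Technical.\n"))
print(route_for("escalation"))
print(route_for("I think this is billing"))
```

Inside the classifier node, the returned name could be passed as `goto` in the `Command`, which replaces the `if`/`elif` chain with a single lookup.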

---
## 🔧 Troubleshooting Common Issues

### Command Object Issues
```python
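# ❌ Common issue: Invalid Command usage
return Command("next_node")  # Wrong: pass the target with goto=
return Command(finish=state.model_dump())  # Wrong: Command has no `finish` parameter

# ✅ Correct Command usage
return Command(goto="next_node", update=state.model_dump())
return Command(goto=END, update=state.model_dump())  # Finish by routing to END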
```

### Message Validation Errors
```python
# ✅ Always validate messages before processing
def validate_message(msg: AgentMessage) -> bool:
    try:
        # Validate required fields
        return all([msg.from_agent, msg.to_agent, msg.content])
    except Exception:
        return False
```
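
A boolean result hides why a message was rejected. The sketch below returns a list of errors instead, in the same `(is_valid, errors)` shape that `validate_agent_message` uses earlier in this session. `SimpleMessage`, the pattern list, and the length limit are stand-ins chosen for illustration; they do not reproduce the `MessageValidator` class.

```python
import html
import re
from dataclasses import dataclass
from typing import List, Tuple

# Illustrative patterns only; a real deployment would need a broader policy.
SUSPICIOUS_PATTERNS = [
    re.compile(r"<\s*script", re.IGNORECASE),
    re.compile(r"javascript:", re.IGNORECASE),
]
MAX_CONTENT_LENGTH = 10_000  # assumed limit


@dataclass
class SimpleMessage:
    """Hypothetical stand-in for AgentMessage with only the fields checked here."""
    from_agent: str
    to_agent: str
    content: str


def validate(msg: SimpleMessage) -> Tuple[bool, List[str]]:
    """Return (is_valid, errors) so callers can log why a message failed."""
    errors = []
    for name in ("from_agent", "to_agent", "content"):
        if not getattr(msg, name, "").strip():
            errors.append(f"missing {name}")
    if len(msg.content) > MAX_CONTENT_LENGTH:
        errors.append("content too long")
    if any(p.search(msg.content) for p in SUSPICIOUS_PATTERNS):
        errors.append("suspicious content")
    return (not errors, errors)


def sanitize(content: str) -> str:
    """Escape HTML special characters so the content is inert if rendered."""
    return html.escape(content.strip())


print(validate(SimpleMessage("coordinator", "researcher", "Please research AI safety")))
print(validate(SimpleMessage("researcher", "analyst", "<script>alert(1)</script>done")))
print(sanitize("<b>hi</b> "))
```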

### Secure Communication Setup
```python
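# ✅ Read the secret key from the environment; the fallback is for local development only
# and must never be used in production, where a known key makes signatures forgeable
SECRET_KEY = os.getenv("AGENT_SECRET_KEY", "fallback-key-for-dev")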
secure_comm = SecureAgentCommunication(SECRET_KEY)
```
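
For deployments, a stricter option is to refuse to start when the key is missing. The sketch below pairs that check with HMAC-SHA256 signing from the standard library. The helpers `load_secret_key`, `sign`, and `verify` are hypothetical and are not methods of `SecureAgentCommunication`; the demo key is set only so the example runs.

```python
import hashlib
import hmac
import os


def load_secret_key(env_var: str = "AGENT_SECRET_KEY") -> bytes:
    """Read the key from the environment and fail fast when it is absent."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key.encode("utf-8")


def sign(payload: str, key: bytes) -> str:
    """Return a hex HMAC-SHA256 signature for the payload."""
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(payload: str, signature: str, key: bytes) -> bool:
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(payload, key), signature)


# Demo only: provide a key when none is configured so the example can run.
os.environ.setdefault("AGENT_SECRET_KEY", "demo-only-key")

key = load_secret_key()
signature = sign("handoff: researcher -> analyst", key)
print(verify("handoff: researcher -> analyst", signature, key))  # signature matches
print(verify("handoff: researcher -> admin", signature, key))    # payload was altered
```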

### State Synchronization
```python
# ✅ Always update state consistently
def agent_node(state: MultiAgentState) -> Command:
    # Modify state
    state.current_agent = "new_agent"
    
    # Always return updated state
    return Command(goto="next_node", update=state.model_dump())
```

---
## 📖 Summary and Next Steps

### What You've Learned:
✅ **Multi-Agent Communication**: Agent handoffs and coordination patterns  
✅ **Command Objects**: Structured control transfer between agents  
✅ **Supervisor Patterns**: Central coordination with OpenAI decision making  
✅ **Secure Communication**: Encryption, signing, and validation  
✅ **Message Validation**: Input sanitization and security checks  
✅ **Network Patterns**: Distributed agent communication  

### Best Practices Covered:
- Use Command objects for clean agent handoffs
- Implement proper message validation and sanitization
- Secure inter-agent communication with encryption
- Design supervisor patterns for complex coordination
- Handle errors gracefully in multi-agent systems
- Log all agent interactions for debugging

### Tomorrow's Preview (Day 5):
🏗️ **Advanced Architectures & Tool Integration**
- Hierarchical multi-agent systems
- Parallel execution and map-reduce patterns
- Tool integration (Tavily, APIs) with OpenAI
- Performance and cost optimization
- Production deployment strategies

### Resources for Further Learning:
- [LangGraph Multi-Agent Documentation](https://langchain-ai.github.io/langgraph/concepts/multi_agent/)
- [Command Objects Reference](https://langchain-ai.github.io/langgraph/reference/types/)
- [Agent Communication Patterns](https://langchain-ai.github.io/langgraph/how-tos/)

**🎯 You're now ready to build sophisticated multi-agent systems with secure communication!**

In [None]:
# List the database files created during this session
import os

print("🧹 Session complete! Database files created:")
db_files = [f for f in os.listdir('.') if f.endswith('.db')]
for db_file in db_files:
    if any(x in db_file for x in ['multi_agent', 'supervisor', 'secure_comm', 'validation', 'support_system']):
        print(f"  📁 {db_file}")

print("\n🎉 Day 4 Complete! You've mastered multi-agent communication and handoffs in LangGraph.")
print("🚀 Ready for Day 5: Advanced Architectures & Tool Integration")