# Week 10: Multi-Agent Systems

## Overview
Welcome to Week 10 of the AI Engineering curriculum. This week focuses on building **multi-agent systems** where multiple autonomous agents collaborate to solve complex problems.

### Learning Objectives
By the end of this week, you will be able to:
- Design multi-agent architectures for scalable systems
- Implement task decomposition and delegation strategies
- Build agent communication protocols
- Create human-in-the-loop systems
- Develop coordination mechanisms for agent collaboration

### Real-World Outcome
Build a **Customer Support Multi-Agent System** that coordinates multiple specialized agents to handle customer inquiries efficiently.

---

## Part 1: Multi-Agent Architectures

### Types of Multi-Agent Systems

1. **Hierarchical**: Manager-worker pattern
2. **Pipeline**: Sequential processing chain
3. **Collaborative**: Peer-to-peer cooperation
4. **Competitive**: Agents with conflicting goals
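
The first two architectures can be contrasted in a few lines. This is a minimal sketch, not a prescribed design: `run_pipeline`, `run_manager`, and the string-transforming "agents" are hypothetical stand-ins, and the round-robin hand-off is only one possible manager policy.

```python
from typing import Callable, List

# A "stage" or "worker" here is any function from text to text (an assumption
# made to keep the sketch small; real agents would carry state and tools).
Stage = Callable[[str], str]

def run_pipeline(stages: List[Stage], text: str) -> str:
    """Pipeline: each stage consumes the previous stage's output."""
    for stage in stages:
        text = stage(text)
    return text

def run_manager(workers: List[Stage], items: List[str]) -> List[str]:
    """Hierarchical: a manager hands items to workers round-robin and collects results."""
    return [workers[i % len(workers)](item) for i, item in enumerate(items)]

print(run_pipeline([str.strip, str.upper], "  reset my password  "))
# RESET MY PASSWORD
print(run_manager([str.upper, str.title], ["first ticket", "second ticket", "third ticket"]))
# ['FIRST TICKET', 'Second Ticket', 'THIRD TICKET']
```

In the pipeline the order of stages matters; in the manager-worker version the items are independent, so the workers could in principle run concurrently.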

### Common Patterns
- **Router**: Directs tasks to appropriate agents
- **Orchestrator**: Coordinates agent activities
- **Consensus**: Agents vote or negotiate until they agree on a single result
- **Specialist**: Domain-specific expert agents
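
As an illustration of the Router and Specialist patterns working together, the sketch below dispatches text to a handler by keyword. The handler names, the `ROUTES` table, and the keyword lists are assumptions made for this example; a real router might use an LLM or a trained classifier instead.

```python
from typing import Callable, Dict, Tuple

# Hypothetical specialist handlers; each stands in for a full agent.
def handle_refund(text: str) -> str:
    return f"refund specialist received: {text}"

def handle_technical(text: str) -> str:
    return f"technical specialist received: {text}"

def handle_general(text: str) -> str:
    return f"general specialist received: {text}"

# Illustrative keyword lists, not a production classifier.
ROUTES: Dict[str, Tuple[Tuple[str, ...], Callable[[str], str]]] = {
    "refund": (("refund", "money back", "chargeback"), handle_refund),
    "technical": (("error", "crash", "bug", "login"), handle_technical),
}

def route(text: str) -> str:
    """Return the reply of the first specialist whose keywords match."""
    lowered = text.lower()
    for keywords, handler in ROUTES.values():
        if any(keyword in lowered for keyword in keywords):
            return handler(text)
    return handle_general(text)  # fallback when nothing matches

print(route("I need a refund for order #456"))
print(route("The app shows an error on login"))
print(route("What are your opening hours?"))
```

Keeping the routing table as data makes it easy to add a specialist without touching `route` itself.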

### TODO 1.1: Implement Multi-Agent Architecture

from typing import List, Dict, Optional, Any
from dataclasses import dataclass, field
from enum import Enum
from abc import ABC, abstractmethod

class AgentRole(Enum):
    """Defines agent roles in multi-agent system."""
    MANAGER = "manager"
    ROUTER = "router"
    SPECIALIST = "specialist"
    EXECUTOR = "executor"

@dataclass
class Message:
    """Message passed between agents."""
    sender: str
    receiver: str
    content: str
    message_type: str
    priority: int = 1
    metadata: Dict[str, Any] = field(default_factory=dict)

class AgentInterface(ABC):
    """Base interface for all agents."""
    
    @abstractmethod
    def process_message(self, message: Message) -> Optional[Message]:
        """Process incoming message."""
        pass
    
    @abstractmethod
    def get_capabilities(self) -> List[str]:
        """Return list of agent capabilities."""
        pass

class MultiAgentSystem:
    """Manages multiple agents and their interactions."""
    
    def __init__(self, name: str):
        self.name = name
        self.agents: Dict[str, AgentInterface] = {}
        self.message_queue: List[Message] = []
        self.history: List[Dict] = []
    
    def register_agent(self, agent_id: str, agent: AgentInterface):
        """Register an agent in the system."""
        # TODO: Implement agent registration
        pass
    
    def send_message(self, message: Message):
        """Send message to an agent."""
        # TODO: Implement message sending with queue management
        pass
    
    def route_task(self, task: str) -> str:
        """Route task to appropriate agent."""
        # TODO: Implement task routing logic
        pass
    
    def coordinate_agents(self, task: Dict) -> Dict:
        """Coordinate multiple agents for complex task."""
        # TODO: Implement agent coordination
        pass

# Test multi-agent system
# TODO: Uncomment and test
# mas = MultiAgentSystem("CustomerSupport")
# print(f"Multi-agent system created: {mas.name}")

---

## Part 2: Task Decomposition & Delegation

### Breaking Down Complex Tasks

**Task Decomposition Strategies**:
1. **Functional**: By capability (search, analyze, write)
2. **Domain**: By expertise (tech, legal, finance)
3. **Sequential**: By workflow steps
4. **Parallel**: Independent subtasks
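
Sequential and parallel decomposition meet in the execution plan: subtasks with no unmet dependencies can run together, and the rest wait. The sketch below groups subtask IDs into such batches with a layered topological sort. The function name `plan_batches` and the example subtasks are hypothetical, and it assumes every dependency also appears as a key in the input.

```python
from typing import Dict, List, Set

def plan_batches(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group subtasks into batches; each batch depends only on earlier batches."""
    # Copy the sets so the caller's mapping is not mutated.
    remaining = {task: set(deps) for task, deps in dependencies.items()}
    batches: List[List[str]] = []
    while remaining:
        # Subtasks whose dependencies have all been scheduled are ready now.
        ready = sorted(task for task, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle detected")
        batches.append(ready)
        for task in ready:
            del remaining[task]
        for deps in remaining.values():
            deps.difference_update(ready)
    return batches

# Hypothetical breakdown of a refund request.
refund_request = {
    "verify_order": set(),
    "check_policy": set(),
    "approve_refund": {"verify_order", "check_policy"},
    "notify_customer": {"approve_refund"},
}
print(plan_batches(refund_request))
# [['check_policy', 'verify_order'], ['approve_refund'], ['notify_customer']]
```

Subtasks inside one batch are independent of each other, so they are candidates for parallel execution.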

### Delegation Principles
- Match task requirements to agent capabilities
- Consider agent workload and availability
- Handle dependencies between subtasks
- Monitor progress and reallocate if needed
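
The first two principles can be combined into one selection rule: keep only agents that cover every required capability, then prefer the least loaded. This is a sketch under assumptions; `pick_agent`, the agent names, and the use of open-task counts as "workload" are all illustrative.

```python
from typing import Dict, Optional, Set

def pick_agent(
    required: Set[str],
    capabilities: Dict[str, Set[str]],
    workload: Dict[str, int],
) -> Optional[str]:
    """Return the least-loaded agent that has every required capability."""
    candidates = [agent for agent, caps in capabilities.items() if required <= caps]
    if not candidates:
        return None  # nobody qualifies; the caller may escalate instead
    # Ties on workload are broken alphabetically so the result is deterministic.
    return min(candidates, key=lambda agent: (workload.get(agent, 0), agent))

# Hypothetical agents and their current number of open tasks.
capabilities = {
    "refund_agent": {"refund", "billing"},
    "billing_agent": {"billing"},
    "tech_agent": {"debug"},
}
workload = {"refund_agent": 3, "billing_agent": 1, "tech_agent": 0}

print(pick_agent({"billing"}, capabilities, workload))  # billing_agent
print(pick_agent({"refund"}, capabilities, workload))   # refund_agent
print(pick_agent({"legal"}, capabilities, workload))    # None
```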

### TODO 2.1: Implement Task Decomposition

from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set

@dataclass
class Subtask:
    """Represents a decomposed subtask."""
    id: str
    description: str
    required_capabilities: List[str]
    dependencies: List[str] = field(default_factory=list)
    assigned_agent: Optional[str] = None
    status: str = "pending"
    result: Optional[Any] = None

class TaskDecomposer:
    """Decomposes complex tasks into subtasks."""
    
    def __init__(self):
        self.decomposition_history: List[Dict] = []
    
    def decompose(self, task_description: str) -> List[Subtask]:
        """Break down task into subtasks."""
        # TODO: Implement task decomposition
        # Use LLM or rule-based logic to identify subtasks
        pass
    
    def identify_dependencies(self, subtasks: List[Subtask]) -> Dict[str, List[str]]:
        """Identify dependencies between subtasks."""
        # TODO: Implement dependency identification
        pass
    
    def create_execution_plan(self, subtasks: List[Subtask]) -> List[List[Subtask]]:
        """Create execution plan respecting dependencies."""
        # TODO: Implement a topological sort; each inner list is a batch of
        # subtasks whose dependencies are all satisfied by earlier batches
        pass

class TaskDelegator:
    """Delegates subtasks to appropriate agents."""
    
    def __init__(self, mas: MultiAgentSystem):
        self.mas = mas
        self.assignments: Dict[str, str] = {}
    
    def match_agent(self, subtask: Subtask) -> Optional[str]:
        """Find best agent for subtask."""
        # TODO: Implement agent matching based on capabilities
        pass
    
    def delegate(self, subtasks: List[Subtask]) -> Dict[str, str]:
        """Delegate all subtasks to agents."""
        # TODO: Implement delegation logic
        pass
    
    def rebalance_load(self) -> Dict[str, int]:
        """Rebalance workload across agents."""
        # TODO: Implement load balancing
        pass

# Test task decomposition
# TODO: Uncomment and test
# decomposer = TaskDecomposer()
# task = "Handle a complex customer refund request"
# subtasks = decomposer.decompose(task)
# print(f"Task decomposed into {len(subtasks)} subtasks")

---

## Part 3: Agent Communication Protocols

### Communication Patterns

1. **Point-to-Point**: Direct agent-to-agent
2. **Broadcast**: One-to-many
3. **Publish-Subscribe**: Event-driven
4. **Request-Reply**: Sender issues a request and waits for the matching response
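
To make the publish-subscribe pattern concrete, here is a minimal in-memory sketch. `TinyBus` is a hypothetical class invented for this example; it delivers synchronously and has no persistence, threading, or error handling.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]

class TinyBus:
    """Hypothetical in-memory bus; topic names are plain strings."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every payload on this topic."""
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> int:
        """Deliver payload to every subscriber of topic; return how many received it."""
        handlers = list(self._handlers.get(topic, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)

bus = TinyBus()
refund_inbox: List[str] = []
audit_inbox: List[str] = []
bus.subscribe("refund", refund_inbox.append)
bus.subscribe("refund", audit_inbox.append)

print(bus.publish("refund", "order #456"))         # 2
print(bus.publish("billing", "invoice question"))  # 0
print(refund_inbox, audit_inbox)
```

The publisher never names its receivers, which is what separates this pattern from point-to-point delivery.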

### Message Types
- **Task**: Request to perform action
- **Result**: Response with outcome
- **Query**: Request for information
- **Status**: Progress update
- **Error**: Failure notification
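
The message types above map naturally onto an `Enum`, and a priority queue can then decide delivery order. The sketch below is one possible arrangement: `MessageType`, `QueuedMessage`, and `enqueue` are hypothetical names, and it assumes that a lower `priority` number means more urgent, which is how `queue.PriorityQueue` orders its entries.

```python
import itertools
from dataclasses import dataclass, field
from enum import Enum
from queue import PriorityQueue

class MessageType(Enum):
    TASK = "task"
    RESULT = "result"
    QUERY = "query"
    STATUS = "status"
    ERROR = "error"

@dataclass(order=True)
class QueuedMessage:
    # Only priority and sequence take part in ordering.
    priority: int
    sequence: int
    message_type: MessageType = field(compare=False)
    content: str = field(compare=False)

# A running counter keeps messages of equal priority in arrival order.
_sequence = itertools.count()

def enqueue(q: "PriorityQueue[QueuedMessage]", priority: int,
            message_type: MessageType, content: str) -> None:
    q.put(QueuedMessage(priority, next(_sequence), message_type, content))

q: "PriorityQueue[QueuedMessage]" = PriorityQueue()
enqueue(q, 2, MessageType.STATUS, "50% done")
enqueue(q, 1, MessageType.ERROR, "payment API timed out")
enqueue(q, 2, MessageType.TASK, "look up order #456")

drained = []
while not q.empty():
    drained.append(q.get().content)
print(drained)
# ['payment API timed out', '50% done', 'look up order #456']
```

Without the sequence number, two messages with the same priority would be compared on their remaining fields, which fails for types that define no ordering.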

### TODO 3.1: Implement Communication Protocol

from queue import PriorityQueue
from threading import Lock
from typing import Callable

class MessageBus:
    """Central message bus for agent communication."""
    
    def __init__(self):
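        # PriorityQueue compares its entries, and Message defines no ordering,
        # so enqueue (priority, sequence_number, message) tuples, not bare messages.
        self.queue: PriorityQueue = PriorityQueue()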
        self.subscribers: Dict[str, List[Callable]] = {}
        self.lock = Lock()
        self.message_log: List[Message] = []
    
    def publish(self, message: Message):
        """Publish message to bus."""
        # TODO: Implement message publishing
        pass
    
    def subscribe(self, topic: str, handler: Callable):
        """Subscribe to messages of specific type."""
        # TODO: Implement subscription
        pass
    
    def deliver(self, message: Message):
        """Deliver message to recipient."""
        # TODO: Implement message delivery
        pass
    
    def broadcast(self, message: Message):
        """Broadcast message to all agents."""
        # TODO: Implement broadcasting
        pass

class CommunicationProtocol:
    """Defines communication protocol between agents."""
    
    def __init__(self, message_bus: MessageBus):
        self.message_bus = message_bus
        self.conversation_contexts: Dict[str, List[Message]] = {}
    
    def send_task(self, sender: str, receiver: str, task: Dict) -> str:
        """Send task to another agent."""
        # TODO: Implement task sending
        pass
    
    def send_result(self, sender: str, receiver: str, result: Any) -> str:
        """Send result back to requesting agent."""
        # TODO: Implement result sending
        pass
    
    def request_collaboration(self, initiator: str, collaborators: List[str], task: Dict) -> str:
        """Request collaboration from multiple agents."""
        # TODO: Implement collaboration request
        pass
    
    def get_conversation_history(self, conversation_id: str) -> List[Message]:
        """Get history of a conversation."""
        # TODO: Implement conversation history retrieval
        pass

# Test communication protocol
# TODO: Uncomment and test
# bus = MessageBus()
# protocol = CommunicationProtocol(bus)
# msg = Message("agent1", "agent2", "Hello", "greeting")
# bus.publish(msg)

---

## Part 4: Human-in-the-Loop Systems

### When to Involve Humans

1. **High-stakes decisions**: Financial, legal, safety
2. **Uncertain situations**: Low-confidence predictions
3. **Edge cases**: Outside training distribution
4. **Escalation**: Agent failures or conflicts
5. **Learning**: Feedback for improvement
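
Several of these triggers reduce to a simple predicate. The sketch below is illustrative only: `needs_review`, the `HIGH_STAKES` set, and the `agent_failed` flag are assumptions, and the default threshold of 0.7 merely mirrors the default used later in this notebook.

```python
from typing import FrozenSet

# Task types treated as high-stakes in this sketch (taken from the list above).
HIGH_STAKES: FrozenSet[str] = frozenset({"financial", "legal", "safety"})

def needs_review(
    confidence: float,
    task_type: str,
    threshold: float = 0.7,
    agent_failed: bool = False,
) -> bool:
    """Return True when a human should look at the decision before it is applied."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return task_type in HIGH_STAKES or confidence < threshold or agent_failed

print(needs_review(0.65, "general"))    # True: below the threshold
print(needs_review(0.95, "financial"))  # True: high-stakes regardless of confidence
print(needs_review(0.95, "general"))    # False
```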

### Human-Agent Interaction Patterns
- **Review & Approve**: Human approves agent decisions
- **Correct & Guide**: Human provides corrections
- **Exception Handling**: Human handles edge cases
- **Active Learning**: Human labels uncertain cases
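
Review & Approve and Correct & Guide can share one code path if the reviewer returns a small decision object. This is a sketch under assumptions: `Decision`, `apply_review`, and `cautious_reviewer` are hypothetical, and the reviewer is a plain function standing in for a real person or review UI.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Decision:
    approved: bool
    corrected_action: Optional[str] = None  # set when the human overrides the agent

Reviewer = Callable[[str], Decision]

def apply_review(proposed_action: str, reviewer: Reviewer) -> Optional[str]:
    """Return the action to execute, or None when the reviewer rejects it outright."""
    decision = reviewer(proposed_action)
    if decision.corrected_action is not None:
        return decision.corrected_action  # Correct & Guide
    return proposed_action if decision.approved else None  # Review & Approve

def cautious_reviewer(action: str) -> Decision:
    """Stand-in for a human: replaces refunds with store credit, approves the rest."""
    if "refund" in action.lower():
        return Decision(approved=False, corrected_action="Issue store credit")
    return Decision(approved=True)

print(apply_review("Process refund", cautious_reviewer))      # Issue store credit
print(apply_review("Send tracking link", cautious_reviewer))  # Send tracking link
```

Logging each `Decision` alongside the proposed action would also yield labeled examples for the Active Learning pattern.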

### TODO 4.1: Implement Human-in-the-Loop

from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional
import time

@dataclass
class HumanReviewRequest:
    """Request for human review."""
    request_id: str
    agent_id: str
    context: Dict[str, Any]
    proposed_action: str
    confidence: float
    priority: int
    created_at: float = field(default_factory=time.time)

class HumanInTheLoop:
    """Manages human-in-the-loop interactions."""
    
    def __init__(self, review_threshold: float = 0.7):
        self.review_threshold = review_threshold
        self.pending_reviews: Dict[str, HumanReviewRequest] = {}
        self.review_history: List[Dict] = []
        self.approval_handler: Optional[Callable] = None
    
    def requires_human_review(self, confidence: float, task_type: str) -> bool:
        """Determine if task requires human review."""
        # TODO: Implement review requirement logic
        pass
    
    def request_review(self, request: HumanReviewRequest) -> str:
        """Request human review for a decision."""
        # TODO: Implement review request
        pass
    
    def submit_feedback(self, request_id: str, approved: bool, feedback: str = ""):
        """Submit human feedback on a request."""
        # TODO: Implement feedback submission
        pass
    
    def set_approval_handler(self, handler: Callable):
        """Set custom approval handler."""
        # TODO: Implement approval handler registration
        pass
    
    def get_pending_reviews(self, priority_filter: Optional[int] = None) -> List[HumanReviewRequest]:
        """Get pending review requests."""
        # TODO: Implement pending reviews retrieval
        pass

class EscalationManager:
    """Manages escalation of issues to humans."""
    
    def __init__(self):
        self.escalation_rules: Dict[str, Callable] = {}
        self.escalation_log: List[Dict] = []
    
    def add_escalation_rule(self, condition: str, handler: Callable):
        """Add escalation rule."""
        # TODO: Implement rule addition
        pass
    
    def check_escalation(self, context: Dict) -> bool:
        """Check if situation requires escalation."""
        # TODO: Implement escalation checking
        pass
    
    def escalate(self, issue: Dict, priority: int = 1):
        """Escalate issue to human."""
        # TODO: Implement escalation
        pass

# Test human-in-the-loop
# TODO: Uncomment and test
# hitl = HumanInTheLoop(review_threshold=0.7)
# request = HumanReviewRequest(
#     request_id="req001",
#     agent_id="support_agent",
#     context={"customer": "VIP"},
#     proposed_action="Process refund",
#     confidence=0.65,
#     priority=2
# )
# hitl.request_review(request)

---

## Part 5: Customer Support Multi-Agent System

### System Architecture

Our customer support system will have:
1. **Router Agent**: Classifies and routes inquiries
2. **Knowledge Agent**: Searches knowledge base
3. **Refund Agent**: Handles refund requests
4. **Technical Agent**: Handles technical issues
5. **Escalation Agent**: Manages complex cases

### Workflow
```
1. Customer inquiry arrives
2. Router classifies inquiry type
3. Router delegates to specialist agent
4. Specialist processes request
5. If needed, escalate to human
6. Return response to customer
```
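
The workflow can be traced end to end with plain functions before any agent classes exist. Everything in this sketch is a placeholder: the keyword classifier, the canned replies, the confidence scores, and the 0.7 escalation threshold are assumptions chosen only to exercise each step.

```python
from typing import Callable, Dict, Optional, Tuple

# A specialist returns (reply, confidence); both are canned values here.
Specialist = Callable[[str], Tuple[str, float]]

def classify(text: str) -> str:
    """Step 2: crude keyword classification (stand-in for a real classifier)."""
    lowered = text.lower()
    if "refund" in lowered:
        return "refund"
    if "error" in lowered or "crash" in lowered:
        return "technical"
    return "general"

def refund_specialist(text: str) -> Tuple[str, float]:
    return ("Your refund request has been logged.", 0.65)

def technical_specialist(text: str) -> Tuple[str, float]:
    return ("Try clearing the app cache and signing in again.", 0.9)

def general_specialist(text: str) -> Tuple[str, float]:
    return ("Thanks for reaching out; here is our help center.", 0.8)

SPECIALISTS: Dict[str, Specialist] = {
    "refund": refund_specialist,
    "technical": technical_specialist,
    "general": general_specialist,
}

def handle(text: str, threshold: float = 0.7) -> Dict[str, Optional[object]]:
    """Steps 2-6: classify, delegate, escalate on low confidence, respond."""
    category = classify(text)
    reply, confidence = SPECIALISTS[category](text)
    escalated = confidence < threshold
    return {
        "category": category,
        "escalated": escalated,
        # An escalated inquiry gets no automatic reply; a human answers it.
        "response": None if escalated else reply,
    }

print(handle("I need a refund for order #456"))
print(handle("The app shows an error"))
```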

### TODO 5.1: Build the Customer Support System

from typing import List, Dict, Optional

class CustomerInquiry:
    """Represents a customer inquiry."""
    
    def __init__(self, inquiry_id: str, customer_id: str, content: str):
        self.inquiry_id = inquiry_id
        self.customer_id = customer_id
        self.content = content
        self.category: Optional[str] = None
        self.priority: int = 1
        self.status: str = "pending"
        self.assigned_agent: Optional[str] = None
        self.response: Optional[str] = None

class RouterAgent(AgentInterface):
    """Routes customer inquiries to appropriate agents."""
    
    def __init__(self, agent_id: str):
        self.agent_id = agent_id
        self.routing_rules: Dict[str, str] = {}
    
    def classify_inquiry(self, inquiry: CustomerInquiry) -> str:
        """Classify inquiry into category."""
        # TODO: Implement inquiry classification
        pass
    
    def route_inquiry(self, inquiry: CustomerInquiry) -> str:
        """Route inquiry to appropriate agent."""
        # TODO: Implement routing logic
        pass
    
    def process_message(self, message: Message) -> Optional[Message]:
        """Process incoming message."""
        # TODO: Implement message processing
        pass
    
    def get_capabilities(self) -> List[str]:
        return ["classify", "route"]

class SpecialistAgent(AgentInterface):
    """Base class for specialist agents."""
    
    def __init__(self, agent_id: str, specialty: str):
        self.agent_id = agent_id
        self.specialty = specialty
        self.knowledge_base: Dict[str, str] = {}
    
    def handle_inquiry(self, inquiry: CustomerInquiry) -> str:
        """Handle customer inquiry."""
        # TODO: Implement inquiry handling
        pass
    
    def process_message(self, message: Message) -> Optional[Message]:
        # TODO: Implement message processing
        pass
    
    def get_capabilities(self) -> List[str]:
        return [self.specialty]

class CustomerSupportSystem:
    """Complete customer support multi-agent system."""
    
    def __init__(self):
        self.mas = MultiAgentSystem("CustomerSupport")
        self.message_bus = MessageBus()
        self.hitl = HumanInTheLoop()
        self.inquiry_queue: List[CustomerInquiry] = []
        self._setup_agents()
    
    def _setup_agents(self):
        """Initialize all specialist agents."""
        # TODO: Create and register all agents
        pass
    
    def handle_inquiry(self, inquiry: CustomerInquiry) -> Dict:
        """Process customer inquiry through the system."""
        # TODO: Implement end-to-end inquiry handling
        # 1. Receive inquiry
        # 2. Route to appropriate agent
        # 3. Process with specialist
        # 4. Check if human review needed
        # 5. Return response
        pass
    
    def get_system_status(self) -> Dict:
        """Get current status of the system."""
        # TODO: Implement status reporting
        pass

# Test customer support system
# TODO: Uncomment and test
# css = CustomerSupportSystem()
# inquiry = CustomerInquiry(
#     inquiry_id="INQ001",
#     customer_id="CUST123",
#     content="I need a refund for order #456"
# )
# result = css.handle_inquiry(inquiry)
# print(f"Inquiry handled: {result}")

---

## Part 6: Coordination & Monitoring

### Coordination Mechanisms

1. **Centralized**: Single coordinator
2. **Decentralized**: Peer coordination
3. **Hybrid**: Mix of both
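
A centralized coordinator is the simplest of the three to sketch, because one object owns every assignment. The `Coordinator` class below is hypothetical; it assigns each task to the agent with the fewest open tasks and reassigns a failed agent's tasks to the survivors.

```python
from typing import Dict, List

class Coordinator:
    """Hypothetical single coordinator that tracks which agent owns which task."""

    def __init__(self, agents: List[str]) -> None:
        self.assignments: Dict[str, List[str]] = {agent: [] for agent in agents}

    def assign(self, task_id: str) -> str:
        """Give the task to the agent with the fewest open tasks (ties: by name)."""
        agent = min(self.assignments, key=lambda a: (len(self.assignments[a]), a))
        self.assignments[agent].append(task_id)
        return agent

    def handle_failure(self, failed_agent: str) -> Dict[str, str]:
        """Remove a failed agent and reassign its tasks; return task -> new agent."""
        if len(self.assignments) < 2:
            raise RuntimeError("no other agent available to take over")
        orphaned = self.assignments.pop(failed_agent)
        return {task_id: self.assign(task_id) for task_id in orphaned}

coordinator = Coordinator(["agent_a", "agent_b"])
for task_id in ["t1", "t2", "t3"]:
    coordinator.assign(task_id)
print(coordinator.assignments)
# {'agent_a': ['t1', 't3'], 'agent_b': ['t2']}
print(coordinator.handle_failure("agent_a"))
# {'t1': 'agent_b', 't3': 'agent_b'}
```

The trade-off is that the coordinator itself becomes a single point of failure, which is what decentralized and hybrid designs try to avoid.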

### Monitoring Metrics
- Agent utilization
- Task completion time
- Success rates
- Escalation frequency
- Customer satisfaction
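
Three of these metrics can be computed directly from per-task records. The `TaskRecord` fields and the `summarize` function below are assumptions made for illustration; agent utilization and customer satisfaction need extra inputs (agent uptime, survey scores) and are left out.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List

@dataclass
class TaskRecord:
    agent_id: str
    duration_s: float  # task completion time in seconds
    succeeded: bool
    escalated: bool

def summarize(records: List[TaskRecord]) -> Dict[str, float]:
    """Aggregate completion time, success rate, and escalation frequency."""
    if not records:
        raise ValueError("no records to summarize")
    total = len(records)
    return {
        "avg_completion_s": mean(r.duration_s for r in records),
        "success_rate": sum(r.succeeded for r in records) / total,
        "escalation_rate": sum(r.escalated for r in records) / total,
    }

# Hypothetical records for four handled inquiries.
records = [
    TaskRecord("refund_agent", 1.0, True, False),
    TaskRecord("refund_agent", 2.0, True, True),
    TaskRecord("tech_agent", 3.0, False, False),
    TaskRecord("tech_agent", 2.0, True, False),
]
print(summarize(records))
# {'avg_completion_s': 2.0, 'success_rate': 0.75, 'escalation_rate': 0.25}
```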

### TODO 6.1: Implement Coordination & Monitoring

import time
from typing import Dict, List

class AgentCoordinator:
    """Coordinates activities across multiple agents."""
    
    def __init__(self, mas: MultiAgentSystem):
        self.mas = mas
        self.active_tasks: Dict[str, Dict] = {}
        self.coordination_log: List[Dict] = []
    
    def assign_task(self, task: Dict, agent_id: str) -> str:
        """Assign task to specific agent."""
        # TODO: Implement task assignment
        pass
    
    def monitor_progress(self, task_id: str) -> Dict:
        """Monitor progress of a task."""
        # TODO: Implement progress monitoring
        pass
    
    def rebalance_workload(self) -> Dict[str, int]:
        """Rebalance workload across agents."""
        # TODO: Implement workload rebalancing
        pass
    
    def handle_agent_failure(self, agent_id: str, task_id: str):
        """Handle agent failure by reassigning tasks."""
        # TODO: Implement failure handling
        pass

class SystemMonitor:
    """Monitors multi-agent system performance."""
    
    def __init__(self):
        self.metrics: Dict[str, List[float]] = {
            "response_time": [],
            "success_rate": [],
            "agent_utilization": [],
            "escalation_rate": []
        }
        self.start_time = time.time()
    
    def record_metric(self, metric_name: str, value: float):
        """Record a metric value."""
        # TODO: Implement metric recording
        pass
    
    def get_agent_performance(self, agent_id: str) -> Dict:
        """Get performance metrics for an agent."""
        # TODO: Implement agent performance reporting
        pass
    
    def get_system_health(self) -> Dict:
        """Get overall system health metrics."""
        # TODO: Implement system health reporting
        pass
    
    def generate_report(self) -> str:
        """Generate monitoring report."""
        # TODO: Implement report generation
        pass

# Test coordination and monitoring
# TODO: Uncomment and test
# monitor = SystemMonitor()
# monitor.record_metric("response_time", 1.5)
# monitor.record_metric("success_rate", 0.95)
# print(monitor.get_system_health())

---

## Summary and Next Steps

### What You've Learned
- Multi-agent system architectures and patterns
- Task decomposition and delegation strategies
- Agent communication protocols
- Human-in-the-loop integration
- Building a complete customer support multi-agent system
- Coordination and monitoring mechanisms

### Next Week Preview
Week 11 will cover **Deployment, Monitoring & Reliability**, where you'll learn:
- FastAPI for AI services
- Dockerization of AI systems
- Cloud deployment basics
- Production monitoring and logging
- Building a deployed AI service

### Further Practice
1. Add more specialist agents to the customer support system
2. Implement advanced routing algorithms
3. Build agent performance optimization
4. Create agent collaboration scenarios
5. Implement consensus mechanisms

---

**Great job on completing Week 10!** 🎉