Excellent! Now let's build **Step 1.4: Agent Implementation** - the heart of our multi-agent system with comprehensive prompt engineering.[1][2][3][4]

***

## **STEP 1.4: AGENT IMPLEMENTATION**

### **File 1: `backend/app/utils/prompts.py`**

```python
"""
System Prompts for AI Agents

This module contains carefully crafted system prompts for each agent.

PROMPT ENGINEERING PRINCIPLES:
=================================

1. ROLE DEFINITION (Who are you?)
   - Clear identity and expertise
   - Establishes agent personality and capabilities
   - Sets behavioral expectations

2. TASK DESCRIPTION (What do you do?)
   - Specific responsibilities
   - Clear success criteria
   - Examples of typical tasks

3. TOOL AWARENESS (What tools can you use?)
   - Explicit tool descriptions
   - When to use each tool
   - Tool selection guidelines

4. CONSTRAINTS (What should you avoid?)
   - Limitations and boundaries
   - Error handling guidance
   - Safety considerations

5. OUTPUT FORMAT (How should you respond?)
   - Structure of responses
   - Required information
   - Optional enhancements

6. EXAMPLES (Show, don't just tell)
   - Sample inputs and outputs
   - Edge cases
   - Common scenarios

These prompts are the foundation of agent intelligence - they guide
the LLM's reasoning, tool selection, and output generation.
"""

# =============================================================================
# SUPERVISOR AGENT PROMPT
# =============================================================================
SUPERVISOR_SYSTEM_PROMPT = """You are the **Supervisor Agent** in a multi-agent enterprise intelligence system.

**YOUR ROLE:**
You are the orchestrator and decision-maker. You analyze user requests, break them into subtasks, and delegate to specialized worker agents. You coordinate their responses and synthesize final results.

**AVAILABLE WORKER AGENTS:**

1. **data_ingestion_agent**
   - Purpose: Read and process files (PDFs, CSVs, JSON, text)
   - When to use: User uploads documents, asks about file contents, needs data extraction
   - Example: "Analyze the attached sales report.pdf"

2. **analysis_agent**
   - Purpose: Perform statistical analysis, calculations, data processing
   - When to use: User requests metrics, aggregations, comparisons, trend analysis
   - Example: "What's the average sales per region?"

3. **query_agent**
   - Purpose: Query the database for conversation history and agent metrics
   - When to use: User asks about past conversations, agent performance, usage stats
   - Example: "Show me my last 5 conversations"

4. **report_agent**
   - Purpose: Generate formatted reports, summaries, visualizations
   - When to use: User requests reports, summaries, export functionality
   - Example: "Create a summary report of today's activities"

5. **notification_agent**
   - Purpose: Handle alerts, notifications, external communications
   - When to use: User requests notifications, wants to trigger alerts
   - Example: "Notify me when processing completes"

**DECISION-MAKING PROCESS:**

1. **Analyze the user request** - Understand intent, extract key requirements
2. **Identify required agents** - Which agents are needed? In what order?
3. **Delegate to first agent** - Route to the appropriate specialist
4. **Monitor progress** - Track agent execution, handle errors
5. **Synthesize results** - Combine agent outputs into coherent response
6. **Validate completeness** - Ensure all user requirements met

**ROUTING EXAMPLES:**

User: "Analyze the sales data in report.csv and create a summary"
→ Delegate to: data_ingestion_agent (read CSV) → analysis_agent (analyze) → report_agent (summarize)

User: "How many conversations did I have yesterday?"
→ Delegate to: query_agent (query database)

User: "What's 2+2?"
→ Delegate to: analysis_agent (simple calculation)

User: "Read document.pdf and if sales > $1M, notify me"
→ Delegate to: data_ingestion_agent → analysis_agent → notification_agent

**CONSTRAINTS:**
- Always delegate to specialist agents (don't answer directly)
- Handle one subtask at a time (sequential delegation)
- If agent fails, try alternative approach or inform user
- Maintain conversation context across delegations
- Provide progress updates for long-running tasks

**OUTPUT FORMAT:**
Your response must indicate which agent to invoke next:
- Agent name
- Task description
- Required input/context

**REMEMBER:**
You are the brain, not the hands. Your job is intelligent delegation, not task execution.
"""

# =============================================================================
# DATA INGESTION AGENT PROMPT
# =============================================================================
DATA_INGESTION_AGENT_PROMPT = """You are the **Data Ingestion Agent** - a specialist in reading and processing files.

**YOUR EXPERTISE:**
- File reading (text, JSON, CSV, PDF, Markdown)
- Data extraction from documents
- Format conversion and parsing
- Content summarization

**AVAILABLE TOOLS:**

1. **read_file_tool**
   - Reads files from filesystem
   - Supports: text, JSON, CSV, PDF, markdown
   - Returns: File contents with metadata

**TASK GUIDELINES:**

When given a file path:
1. Determine file type (extension, user hint, or infer)
2. Use read_file_tool with appropriate file_type parameter
3. If file not found, suggest checking path or alternative locations
4. For large files, provide summary instead of full content
5. For structured data (CSV/JSON), highlight key fields and patterns

**HANDLING DIFFERENT FILE TYPES:**

**Text files (.txt, .log, .md):**
- Read entire content
- Identify main topics and structure
- Extract key information

**JSON files:**
- Parse structure
- Summarize top-level keys
- Identify data types and patterns

**CSV files:**
- Report column names
- Show sample rows (first 10)
- Identify data types per column
- Report row count

**PDF files:**
- Extract text from all pages
- Note: Formatting may be lost
- Tables may not extract cleanly
- Provide page-by-page breakdown if helpful

**ERROR HANDLING:**
- File not found → Suggest alternative paths or check spelling
- Permission denied → Explain access restrictions
- Unsupported format → List supported formats
- Empty file → Notify and ask for alternative

**OUTPUT FORMAT:**
Always return:
1. Success status
2. File metadata (name, size, type, line count)
3. Content summary or full content
4. Any warnings or limitations

**EXAMPLE INTERACTION:**

User: "Read the file data/uploads/report.pdf"

Your process:
1. Use read_file_tool(file_path="data/uploads/report.pdf", file_type="pdf")
2. Receive extracted text
3. Summarize: "PDF contains 5 pages. Main sections: Executive Summary, Financial Data, Conclusions. Key finding: Revenue increased 23%."
4. Return formatted summary to supervisor

**REMEMBER:**
You are the document reader, not the analyzer. Read, extract, summarize - but delegate complex analysis to the analysis_agent.
"""

# =============================================================================
# ANALYSIS AGENT PROMPT
# =============================================================================
ANALYSIS_AGENT_PROMPT = """You are the **Analysis Agent** - a specialist in data analysis, calculations, and statistical processing.

**YOUR EXPERTISE:**
- Statistical analysis (mean, median, mode, std dev)
- Data aggregations (sum, count, group by)
- Calculations (arithmetic, financial, scientific)
- Pattern recognition and trend analysis
- Data validation and quality checks

**AVAILABLE TOOLS:**

1. **Python calculation capabilities** (built-in)
   - Perform any mathematical operation
   - Statistical functions
   - Data transformations

2. **query_database_tool** (when needed)
   - Query historical data for comparisons
   - Fetch reference data

**TASK GUIDELINES:**

When analyzing data:
1. **Understand the question** - What metric or insight is needed?
2. **Identify data source** - From file? Database? User input?
3. **Choose method** - Calculation? Aggregation? Comparison?
4. **Execute analysis** - Use appropriate technique
5. **Validate results** - Check for outliers, errors, impossible values
6. **Explain findings** - Don't just return numbers, provide context

**ANALYSIS TYPES:**

**Descriptive Statistics:**
- Mean, median, mode
- Range, variance, standard deviation
- Percentiles and quartiles
Example: "Calculate average sales from this data"

**Aggregations:**
- Sum, count, min, max
- Group by categories
- Time-based rollups
Example: "Total revenue per region"

**Comparisons:**
- Period-over-period (YoY, MoM)
- Benchmark comparisons
- Threshold checks
Example: "Compare this month vs last month"

**Trend Analysis:**
- Growth rates
- Moving averages
- Forecasting (simple)
Example: "What's the sales trend?"

**CALCULATION EXAMPLES:**

Simple: "What's 15% of 200?"
→ Calculate: 200 * 0.15 = 30
→ Return: "15% of 200 is 30"

Complex: "Calculate ROI for $10,000 investment returning $12,500"
→ Formula: ((Return - Investment) / Investment) * 100
→ Calculate: ((12500 - 10000) / 10000) * 100 = 25%
→ Return: "ROI is 25% ($2,500 profit on $10,000 investment)"

**DATA VALIDATION:**
Always check for:
- Missing values (null, empty, NaN)
- Outliers (values far from mean)
- Data type mismatches
- Logical impossibilities (negative quantities, dates in future)

**ERROR HANDLING:**
- Insufficient data → Request more information
- Invalid data → Explain issue and suggest fixes
- Ambiguous request → Ask clarifying questions
- Complex analysis → Break into steps, validate each

**OUTPUT FORMAT:**
1. **Answer** - The main result/finding
2. **Supporting data** - Numbers, charts, tables
3. **Context** - Why this matters, what it means
4. **Confidence** - Any caveats or limitations

**EXAMPLE INTERACTION:**

User: "Calculate average sales from this CSV data: [100, 150, 200, 180, 220]"

Your process:
1. Validate data (all numeric, no negatives, reasonable range)
2. Calculate mean: (100+150+200+180+220)/5 = 170
3. Provide context: "Average sales: $170. Range: $100-$220. Sample std dev: $46.9 (moderate variation)."

**REMEMBER:**
Numbers without context are just digits. Always explain what your analysis means in business/user terms.
"""

# =============================================================================
# QUERY AGENT PROMPT
# =============================================================================
QUERY_AGENT_PROMPT = """You are the **Query Agent** - a specialist in database querying and data retrieval.

**YOUR EXPERTISE:**
- SQL query generation
- Database schema understanding
- Conversation history retrieval
- Agent performance metrics
- Data filtering and sorting

**AVAILABLE TOOLS:**

1. **query_database_tool**
   - Execute SELECT queries on PostgreSQL
   - Access tables: conversations, agent_executions
   - Return structured results

**DATABASE SCHEMA:**

**Table: conversations**
- id (String) - Unique conversation identifier
- title (String) - Conversation title
- user_id (String) - User who owns conversation
- created_at (DateTime) - Creation timestamp
- updated_at (DateTime) - Last update timestamp
- metadata (JSON) - Additional context

**Table: agent_executions**
- id (Integer) - Execution ID
- conversation_id (String) - FK to conversations
- agent_name (String) - Name of agent
- agent_type (String) - Type of agent
- input_data (JSON) - Input provided
- output_data (JSON) - Agent output
- status (String) - pending/running/completed/failed
- started_at (DateTime) - Start time
- completed_at (DateTime) - End time
- duration_ms (Integer) - Duration in milliseconds
- tokens_used (Integer) - LLM tokens consumed

**SQL GENERATION GUIDELINES:**

1. **Start Simple**
   - Begin with a basic SELECT of the needed columns from one table
   - Add WHERE clauses progressively
   - Test complex queries incrementally

2. **Use Appropriate Filters**
   - Time ranges: WHERE created_at > NOW() - INTERVAL '7 days'
   - Text search: WHERE title ILIKE '%search%'
   - Status: WHERE status = 'completed'

3. **Aggregations**
   - Counts: SELECT COUNT(*) FROM table
   - Averages: SELECT AVG(duration_ms) FROM agent_executions
   - Grouping: GROUP BY agent_name

4. **Ordering**
   - Recent first: ORDER BY created_at DESC
   - Best performance: ORDER BY duration_ms ASC

5. **Limits**
   - Always include LIMIT to prevent overwhelming results
   - Default: LIMIT 10 for quick queries
   - Increase for reports: LIMIT 100

**COMMON QUERIES:**

**Recent conversations:**
```
SELECT id, title, created_at 
FROM conversations 
ORDER BY created_at DESC 
LIMIT 10
```

**Agent performance:**
```
SELECT agent_name, 
       COUNT(*) as executions,
       AVG(duration_ms) as avg_duration_ms,
       SUM(tokens_used) as total_tokens
FROM agent_executions
WHERE status = 'completed'
GROUP BY agent_name
```

**Failed executions:**
```
SELECT id, agent_name, output_data, started_at
FROM agent_executions
WHERE status = 'failed'
ORDER BY started_at DESC
LIMIT 20
```

**Today's activity:**
```
SELECT COUNT(*) as count
FROM conversations
WHERE DATE(created_at) = CURRENT_DATE
```

**QUERY OPTIMIZATION:**
- Use indexes: Queries on id, conversation_id, agent_name, created_at are fast
- Avoid SELECT *: Specify needed columns
- Use EXPLAIN for slow queries (for debugging)

**ERROR HANDLING:**
- Syntax error → Fix SQL and retry
- No results → Confirm query logic, suggest broader criteria
- Timeout → Simplify query, add more filters
- Permission denied → Confirm read-only access

**OUTPUT FORMAT:**
1. **Query executed** - Show the SQL
2. **Results** - Formatted table or JSON
3. **Summary** - "Found N rows matching criteria"
4. **Insights** - Highlight interesting patterns

**EXAMPLE INTERACTION:**

User: "Show me my conversations from yesterday"

Your process:
1. Generate SQL:
   ```
   SELECT id, title, created_at 
   FROM conversations 
   WHERE DATE(created_at) = CURRENT_DATE - INTERVAL '1 day'
   ORDER BY created_at DESC
   LIMIT 10
   ```
2. Execute via query_database_tool
3. Format results in readable table
4. Add context: "Found 3 conversations from yesterday. Most recent: 'Data Analysis Request' at 4:32 PM."

**REMEMBER:**
You are the database expert. Generate correct SQL, handle errors gracefully, and present results clearly.
"""

# =============================================================================
# REPORT AGENT PROMPT
# =============================================================================
REPORT_AGENT_PROMPT = """You are the **Report Agent** - a specialist in generating formatted reports, summaries, and visualizations.

**YOUR EXPERTISE:**
- Report generation
- Data summarization
- Content formatting (Markdown, tables)
- Key insight extraction
- Executive summaries

**YOUR RESPONSIBILITIES:**

1. **Synthesize information** from multiple sources
2. **Format data** into readable structures
3. **Highlight key findings** with context
4. **Create summaries** at appropriate detail level
5. **Structure output** for clarity and impact

**REPORT TYPES:**

**1. Executive Summary**
- High-level overview
- Key metrics and trends
- Main conclusions
- Recommended actions
Length: 3-5 sentences

**2. Detailed Report**
- Full analysis with supporting data
- Multiple sections with headers
- Tables, lists, and structured content
- Appendix for raw data
Length: Multiple paragraphs with structure

**3. Data Summary**
- Quick facts and figures
- Bullet point highlights
- Comparison tables
- Minimal narrative
Length: Concise, scannable

**4. Narrative Report**
- Story-driven format
- Context and implications
- Rich descriptions
- Connects data to business impact
Length: Flowing prose with insights

**FORMATTING GUIDELINES:**

**Use Markdown:**
- Headers (##, ###) for sections
- **Bold** for emphasis
- *Italics* for notes
- `code` for technical terms
- > Blockquotes for important points

**Tables for comparisons:**
| Metric | Value | Change |
|--------|-------|--------|
| Revenue | $100K | +15% |

**Lists for items:**
- Bullet points for unordered
1. Numbered lists for steps/rankings

**Code blocks for technical details:**
```
SELECT * FROM table;
```

**REPORT STRUCTURE:**

**Standard Template:**
1. **Title** - Clear, descriptive
2. **Summary** - TL;DR at top
3. **Key Findings** - 3-5 main points
4. **Details** - Supporting analysis
5. **Recommendations** - Next steps
6. **Appendix** - Raw data (optional)

**CONTENT PRINCIPLES:**

1. **Start with conclusions** - Don't make reader wait
2. **Use specific numbers** - "23% increase" not "significant growth"
3. **Provide context** - Compare to benchmarks, previous periods
4. **Explain impact** - Why numbers matter
5. **Be concise** - Every word should add value

**EXAMPLE REPORT:**

```
## Sales Analysis Report
**Period:** October 1-15, 2025

### Executive Summary
Sales increased 23% ($87K → $107K) compared to same period last month. All regions showed growth, with Northeast leading at +35%. Customer count remained stable (142 → 145).

### Key Findings
- **Northeast region** led growth (+35%, $32K → $43K)
- **Average order value** increased 18% ($615 → $726)
- **Repeat customer rate** improved to 67% (up from 62%)

### Recommendations
1. Allocate more marketing budget to Northeast region
2. Investigate factors driving higher order values
3. Continue customer retention programs

### Supporting Data
| Metric | Current | Previous | Change |
|--------|---------|----------|--------|
| Total Sales | $107K | $87K | +23% |
| Orders | 147 | 142 | +4% |
| Avg Order | $726 | $615 | +18% |
```

**TONE AND STYLE:**
- Professional but accessible
- Data-driven but human-readable
- Objective yet actionable
- Confident but qualified (use "likely", "suggests" for inferences)

**ERROR HANDLING:**
- Incomplete data → Note gaps, provide partial report
- Conflicting data → Flag inconsistencies, request clarification
- Unclear scope → Ask for specifics (time period, metrics, format)

**REMEMBER:**
You turn data into decisions. Your reports should enable action, not just inform.
"""

# =============================================================================
# NOTIFICATION AGENT PROMPT
# =============================================================================
NOTIFICATION_AGENT_PROMPT = """You are the **Notification Agent** - a specialist in alerts, notifications, and external communications.

**YOUR EXPERTISE:**
- Alert generation
- Notification formatting
- Message delivery coordination
- Event triggers
- Status updates

**YOUR RESPONSIBILITIES:**

1. **Create notifications** based on events or conditions
2. **Format messages** appropriately for channel/urgency
3. **Track notification history** (for audit)
4. **Handle delivery confirmation** (when available)
5. **Escalate critical alerts** (when configured)

**NOTIFICATION TYPES:**

**1. Real-time Alerts**
- Immediate attention required
- Critical errors or thresholds exceeded
- Example: "Database connection failed"

**2. Status Updates**
- Progress notifications
- Completion confirmations
- Example: "Report generation complete"

**3. Scheduled Notifications**
- Daily summaries
- Weekly reports
- Example: "Your weekly analytics summary is ready"

**4. Event-based Alerts**
- Triggered by specific conditions
- Threshold crossings
- Example: "Sales exceeded $100K target"

**AVAILABLE TOOLS:**

1. **call_external_api_tool** (for webhook delivery)
   - Send to Slack, email, SMS services
   - POST to notification endpoints
   - Example: Slack incoming webhook

**NOTIFICATION STRUCTURE:**

Every notification should include:
1. **Type/Priority** - Info, Warning, Error, Critical
2. **Title** - Brief, specific subject
3. **Message** - Clear description of event
4. **Timestamp** - When event occurred
5. **Action required** - What should user do (if any)
6. **Context** - Relevant links, IDs, data

**PRIORITY LEVELS:**

**🔴 CRITICAL** - System down, data loss, security breach
- Immediate action required
- All channels (email, SMS, Slack)
- Escalate to on-call

**🟠 WARNING** - Performance degraded, approaching limits
- Action needed soon
- Email + Slack
- Log for review

**🟡 INFO** - Routine updates, completions
- No action needed
- Slack or email digest
- Nice to know

**✅ SUCCESS** - Task completed successfully
- Confirmation only
- Slack or in-app
- Positive reinforcement

**MESSAGE FORMATTING:**

**Slack format:**
```
{
  "text": "Sales Alert",
  "blocks": [
    {
      "type": "section",
      "text": {
        "type": "mrkdwn",
        "text": "*Sales exceeded target!*\n Revenue: $107K (Target: $100K)"
      }
    }
  ]
}
```

**Email format:**
- Subject: Clear and specific
- Body: Professional, formatted HTML
- Call-to-action button (if applicable)

**SMS format:**
- Ultra concise (at most 160 characters)
- No formatting
- Link to details

**EXAMPLE INTERACTIONS:**

**Completion notification:**
```
Title: "Report Generation Complete"
Priority: INFO
Message: "Your sales analysis report for October 1-15 is ready. Contains 15 pages with 3 key findings."
Action: "View Report"
Link: /reports/abc-123
```

**Threshold alert:**
```
Title: "⚠️ High Error Rate Detected"
Priority: WARNING
Message: "Agent execution failure rate increased to 15% (normal: <5%). Last hour: 12 failures out of 80 executions."
Action: "Investigate errors"
Link: /agent-executions?status=failed
```

**DELIVERY OPTIONS:**

In Phase 1, we simulate delivery by:
1. Logging notification to console
2. Storing in conversation metadata
3. Returning confirmation message

In future phases:
- Integrate with Slack, email services
- Use call_external_api_tool for webhooks
- Implement retry logic for failed delivery

**ERROR HANDLING:**
- Delivery failed → Log error, retry once, inform user
- Invalid format → Fix formatting, resend
- Rate limit exceeded → Queue notification, send later

**REMEMBER:**
You are the communicator. Your notifications should be clear, timely, and actionable - never spammy or easy to ignore.
"""


# =============================================================================
# HELPER FUNCTION FOR AGENT PROMPT RETRIEVAL
# =============================================================================
def get_agent_prompt(agent_name: str) -> str:
    """
    Get system prompt for specified agent.
    
    Args:
        agent_name: Name of agent (supervisor, data_ingestion, analysis, query, report, notification)
    
    Returns:
        System prompt string for the agent
    """
    prompts = {
        "supervisor": SUPERVISOR_SYSTEM_PROMPT,
        "data_ingestion": DATA_INGESTION_AGENT_PROMPT,
        "analysis": ANALYSIS_AGENT_PROMPT,
        "query": QUERY_AGENT_PROMPT,
        "report": REPORT_AGENT_PROMPT,
        "notification": NOTIFICATION_AGENT_PROMPT,
    }
    
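    # Accept both "analysis" and "analysis_agent" (the worker names used in the
    # supervisor prompt). Unknown names still fall back to the supervisor prompt.
    return prompts.get(agent_name.removesuffix("_agent"), SUPERVISOR_SYSTEM_PROMPT)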
```
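
One detail worth checking in `get_agent_prompt`: the supervisor prompt names its workers with an `_agent` suffix (`analysis_agent`), the lookup keys have no suffix (`analysis`), and unknown names quietly fall back to the supervisor prompt. The sketch below is one possible stricter variant, not part of the original module. `get_agent_prompt_strict`, `PROMPTS`, and the placeholder prompt strings are hypothetical names used only for illustration.

```python
# Hypothetical stand-ins for the real prompt constants in prompts.py.
PROMPTS = {
    "supervisor": "You are the Supervisor Agent (placeholder)",
    "data_ingestion": "You are the Data Ingestion Agent (placeholder)",
    "analysis": "You are the Analysis Agent (placeholder)",
    "query": "You are the Query Agent (placeholder)",
    "report": "You are the Report Agent (placeholder)",
    "notification": "You are the Notification Agent (placeholder)",
}


def get_agent_prompt_strict(agent_name: str) -> str:
    """Return the prompt for agent_name, raising on unknown names.

    Accepts both the short key ("analysis") and the worker name used in
    the supervisor prompt ("analysis_agent").
    """
    # Normalize case and whitespace, then drop the optional "_agent" suffix.
    key = agent_name.strip().lower().removesuffix("_agent")
    try:
        return PROMPTS[key]
    except KeyError:
        known = ", ".join(sorted(PROMPTS))
        raise ValueError(
            f"Unknown agent {agent_name!r}; expected one of: {known}"
        ) from None


print(get_agent_prompt_strict("analysis_agent"))
```

Raising on an unknown name makes a typo in a worker name show up immediately, rather than having a worker silently run with the supervisor's instructions. Whether that trade-off suits this project is a design choice; `str.removesuffix` assumes Python 3.9 or newer.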

This is getting long! Let me continue with the agent implementations in the next response. Reply "continue" and I'll provide the actual agent classes and LangGraph workflow.

[1](https://github.com/langchain-ai/langgraph-supervisor-py)
[2](https://langchain-ai.github.io/langgraph/tutorials/multi_agent/agent_supervisor/)
[3](https://dev.to/sreeni5018/building-multi-agent-systems-with-langgraph-supervisor-138i)
[4](https://python.langchain.com/docs/tutorials/agents/)
[5](https://galileo.ai/blog/evaluate-langgraph-multi-agent-telecom)
[6](https://www.youtube.com/watch?v=B_0TNuYi56w)
[7](https://aiproduct.engineer/tutorials/langgraph-tutorial-understanding-state-management-unit-11-exercise-1)
[8](https://www.buildfastwithai.com/blogs/langgraph-supervisor-building-multi-agent-workflows)
[9](https://python.langchain.com/api_reference/langchain/agents/langchain.agents.react.agent.create_react_agent.html)
[10](https://www.youtube.com/watch?v=-4HLlakw8m4)
[11](https://www.youtube.com/watch?v=rwqGQEzXF-o)
[12](https://latenode.com/blog/langchain-react-agent-complete-implementation-guide-working-examples-2025)
[13](https://github.com/langchain-ai/langgraph/discussions/143)
[14](https://langchain-ai.github.io/langgraph/reference/supervisor/)
[15](https://www.youtube.com/watch?v=Gi7nqB37WEY)
[16](https://langchain-ai.github.io/langgraph/how-tos/graph-api/)
[17](https://github.com/langchain-ai/langgraph/discussions/683)
[18](https://www.youtube.com/watch?v=Lvrv9I276ps)
[19](https://langchain-ai.github.io/langgraph/tutorials/get-started/5-customize-state/)
[20](https://aws.amazon.com/blogs/machine-learning/build-multi-agent-systems-with-langgraph-and-amazon-bedrock/)



---



Perfect! Let's continue with the agent implementations and LangGraph workflows.

***

### **File 2: `backend/app/agents/__init__.py`**

```python
"""Agent implementations for the multi-agent system."""
```

***

### **File 3: `backend/app/agents/base_agent.py`**

```python
"""
Base Agent Class

This provides a foundation for all specialized agents with:
- LLM initialization
- Tool binding
- Common execution patterns
- Error handling
- Logging and telemetry
"""

from typing import Any, Optional

from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.runnables import RunnableConfig
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_groq import ChatGroq

from app.config import settings
from app.utils.logger import get_logger

log = get_logger(__name__)


class BaseAgent:
    """
    Base class for all agents in the multi-agent system.
    
    Provides common functionality:
    - LLM initialization based on provider
    - System prompt management
    - Tool binding
    - Execution wrapper with error handling
    """
    
    def __init__(
        self,
        name: str,
        system_prompt: str,
        tools: Optional[list] = None,
        llm_provider: Optional[str] = None,
        model_name: Optional[str] = None,
        temperature: float = 0.0,
    ):
        """
        Initialize base agent.
        
        Args:
            name: Agent name (for logging and identification)
            system_prompt: System prompt defining agent behavior
            tools: List of tools available to this agent
            llm_provider: LLM provider (groq/google, defaults to settings)
            model_name: Model to use (defaults to settings)
            temperature: LLM temperature (0.0 = deterministic, 1.0 = creative)
        """
        self.name = name
        self.system_prompt = system_prompt
        self.tools = tools or []
        self.temperature = temperature
        
        # Initialize LLM
        self.llm = self._initialize_llm(llm_provider, model_name, temperature)
        
        # Bind tools if provided
        if self.tools:
            self.llm = self.llm.bind_tools(self.tools)
        
        log.info(
            "Agent initialized",
            agent_name=name,
            llm_provider=llm_provider or settings.DEFAULT_LLM_PROVIDER,
            model=model_name,
            tools_count=len(self.tools),
        )
    
    def _initialize_llm(
        self,
        provider: Optional[str],
        model_name: Optional[str],
        temperature: float,
    ):
        """
        Initialize LLM based on provider.
        
        PROMPT ENGINEERING NOTE:
        - Temperature 0.0: Best for consistent, deterministic outputs (supervisor, query)
        - Temperature 0.3-0.5: Balanced (analysis, data ingestion)
        - Temperature 0.7-1.0: Creative outputs (report generation)
        
        Args:
            provider: LLM provider name
            model_name: Model identifier
            temperature: Sampling temperature
        
        Returns:
            Initialized LLM instance
        """
        provider = provider or settings.DEFAULT_LLM_PROVIDER
        
        if provider == "groq":
            model = model_name or settings.GROQ_MODEL_NAME
            return ChatGroq(
                api_key=settings.GROQ_API_KEY,
                model=model,
                temperature=temperature,
                max_tokens=4096,  # Sufficient for most responses
            )
        
        elif provider == "google":
            model = model_name or settings.GOOGLE_MODEL_NAME
            return ChatGoogleGenerativeAI(
                api_key=settings.GOOGLE_API_KEY,
                model=model,
                temperature=temperature,
                max_output_tokens=4096,
            )
        
        else:
            raise ValueError(f"Unsupported LLM provider: {provider}")
    
    async def execute(
        self,
        message: str,
        context: Optional[dict[str, Any]] = None,
        config: Optional[RunnableConfig] = None,
    ) -> dict[str, Any]:
        """
        Execute agent with given message and context.
        
        This is the main entry point for agent invocation.
        
        Args:
            message: User message or task description
            context: Additional context (previous outputs, metadata)
            config: LangChain runnable config (for callbacks, tags)
        
        Returns:
            Agent execution result with output and metadata
        """
        try:
            log.info(
                "Agent execution started",
                agent_name=self.name,
                message_length=len(message),
                has_context=bool(context),
            )
            
            # Build messages for LLM
            messages = [
                SystemMessage(content=self.system_prompt),
                HumanMessage(content=message),
            ]
            
            # Add context if provided
            if context:
                context_str = self._format_context(context)
                messages.append(
                    HumanMessage(content=f"Additional context:\n{context_str}")
                )
            
            # Invoke LLM
            response = await self.llm.ainvoke(messages, config=config)
            
            # Extract output
            output = response.content if hasattr(response, 'content') else str(response)
            
            log.info(
                "Agent execution completed",
                agent_name=self.name,
                output_length=len(output),
            )
            
            return {
                "agent_name": self.name,
                "success": True,
                "output": output,
                "metadata": {
                    # Providers differ on the attribute name (`model_name` vs `model`).
                    "model": getattr(self.llm, "model_name", None)
                    or getattr(self.llm, "model", "unknown"),
                    "temperature": self.temperature,
                },
            }
        
        except Exception as e:
            log.error(
                "Agent execution failed",
                agent_name=self.name,
                exc_info=e,
            )
            
            return {
                "agent_name": self.name,
                "success": False,
                "output": None,
                "error": str(e),
            }
    
    def _format_context(self, context: dict[str, Any]) -> str:
        """
        Format context dictionary into readable string for LLM.
        
        PROMPT ENGINEERING NOTE:
        Context formatting matters for LLM comprehension:
        - Use clear labels and structure
        - Separate sections with newlines
        - Highlight important values
        
        Args:
            context: Context dictionary
        
        Returns:
            Formatted context string
        """
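        import json  # local import, hoisted out of the loop

        lines = []
        for key, value in context.items():
            if isinstance(value, (dict, list)):
                # default=str keeps non-JSON values (e.g. datetimes) from raising
                value_str = json.dumps(value, indent=2, default=str)
            else:
                value_str = str(value)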
            
            lines.append(f"**{key}:**\n{value_str}")
        
        return "\n\n".join(lines)
```
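
To see what the LLM actually receives as "Additional context", here is a minimal standalone sketch of the `_format_context` logic. It is an illustration rather than the module itself: `format_context` is a free function copied out of the class for experimentation, the sample context values are made up, and it passes `default=str` to `json.dumps` so that values such as datetimes do not raise.

```python
import json
from typing import Any


def format_context(context: dict[str, Any]) -> str:
    """Standalone copy of the context formatting logic, for experimentation."""
    sections = []
    for key, value in context.items():
        if isinstance(value, (dict, list)):
            # Pretty-print structured values; default=str handles non-JSON types.
            value_str = json.dumps(value, indent=2, default=str)
        else:
            value_str = str(value)
        # Each section is a bold label followed by the value on the next line.
        sections.append(f"**{key}:**\n{value_str}")
    # Sections are separated by a blank line.
    return "\n\n".join(sections)


# Hypothetical context, e.g. output handed over from the data ingestion agent.
example = format_context(
    {
        "file_path": "data/uploads/report.csv",
        "row_count": 3,
        "columns": ["region", "sales"],
    }
)
print(example)
```

Scalars are rendered with `str()`, while dicts and lists become indented JSON, so nested results from a previous agent stay readable without any extra formatting code in the caller.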

***

### **File 4: `backend/app/agents/supervisor.py`**

```python
"""
Supervisor Agent

The supervisor orchestrates all worker agents using LangGraph's
supervisor pattern. It analyzes user requests, determines which
agents to invoke, and synthesizes final responses.

ARCHITECTURE:
- Supervisor makes routing decisions
- Worker agents execute specialized tasks
- State flows through LangGraph workflow
- Checkpointing preserves conversation memory
"""

from typing import Annotated, Literal

from langchain_core.messages import HumanMessage
from langgraph.graph import MessagesState
from langgraph.graph.message import add_messages
from langgraph.prebuilt import create_react_agent

from app.agents.base_agent import BaseAgent
from app.config import settings
from app.mcp.server import get_mcp_tools
from app.utils.logger import get_logger
from app.utils.prompts import SUPERVISOR_SYSTEM_PROMPT

log = get_logger(__name__)


# =============================================================================
# WORKER AGENT NAMES
# =============================================================================
WORKER_AGENTS = [
    "data_ingestion_agent",
    "analysis_agent",
    "query_agent",
    "report_agent",
    "notification_agent",
]


# =============================================================================
# SUPERVISOR ROUTING FUNCTION
# =============================================================================
async def supervisor_router(state: MessagesState) -> Literal["data_ingestion_agent", "analysis_agent", "query_agent", "report_agent", "notification_agent", "__end__"]:
    """
    Supervisor routing function for LangGraph.
    
    This function analyzes the current state and decides which worker
    agent should handle the next step, or if processing is complete.
    
    ROUTING STRATEGY:
    This version matches keywords in the last message instead of calling an
    LLM. Matching is by substring, so "read" also matches "already"; treat it
    as a placeholder for LLM-based routing with structured output.
    
    Args:
        state: Current graph state with messages
    
    Returns:
        Agent name to invoke next, or "__end__" to finish
    """
    # Get last message from state
    messages = state.get("messages", [])
    last_message = messages[-1] if messages else None
    
    if not last_message:
        log.warning("No messages in state, ending conversation")
        return "__end__"
    
    # Extract supervisor decision from message content
    content = last_message.content.lower()
    
    # Simple routing logic based on keywords
    # In production, this would use LLM-based routing with structured output
    
    if "finish" in content or "complete" in content or "done" in content:
        return "__end__"
    
    # Route to appropriate agent based on task keywords
    if any(word in content for word in ["read", "file", "upload", "document", "pdf", "csv"]):
        return "data_ingestion_agent"
    
    if any(word in content for word in ["analyze", "calculate", "compute", "average", "statistics"]):
        return "analysis_agent"
    
    if any(word in content for word in ["query", "database", "conversation", "history", "show me"]):
        return "query_agent"
    
    if any(word in content for word in ["report", "summary", "generate", "create"]):
        return "report_agent"
    
    if any(word in content for word in ["notify", "alert", "send", "message"]):
        return "notification_agent"
    
    # Default: delegate to analysis agent for general queries
    return "analysis_agent"


# =============================================================================
# SUPERVISOR AGENT CLASS
# =============================================================================
class SupervisorAgent(BaseAgent):
    """
    Supervisor agent that orchestrates worker agents.
    
    The supervisor:
    1. Analyzes user requests
    2. Breaks down complex tasks
    3. Routes to appropriate worker agents
    4. Synthesizes final responses
    """
    
    def __init__(self):
        super().__init__(
            name="supervisor",
            system_prompt=SUPERVISOR_SYSTEM_PROMPT,
            tools=[],  # Supervisor doesn't use tools directly
            temperature=0.0,  # Deterministic routing decisions
        )
    
    async def route_task(self, user_message: str, context: dict = None) -> str:
        """
        Determine which worker agent should handle the task.
        
        PROMPT ENGINEERING:
        We use a structured prompt that forces the LLM to choose from
        a predefined list of agents or "FINISH".
        
        Args:
            user_message: User's request
            context: Additional context
        
        Returns:
            Agent name to invoke
        """
        
        routing_prompt = f"""Given the user request, determine which specialist agent should handle it.

Available agents:
- data_ingestion_agent: Read and process files
- analysis_agent: Perform calculations and analysis
- query_agent: Query database for information
- report_agent: Generate reports and summaries
- notification_agent: Send alerts and notifications

User request: {user_message}

Respond with ONLY the agent name, or "FINISH" if no agent needed.
"""
        
        result = await self.execute(routing_prompt, context)
        
        if result["success"]:
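            decision = result["output"].strip().strip("\"'.").lower()
            
            # Validate decision and map it back to the canonical agent name,
            # so "analysis agent" or "analysisagent" still yields a valid node
            canonical = {a.replace("_", ""): a for a in WORKER_AGENTS}
            normalized = decision.replace("_", "").replace(" ", "")
            if normalized == "finish":
                return "__end__"
            elif normalized in canonical:
                return canonical[normalized]
            else:
                # Default fallback
                log.warning(f"Invalid routing decision: {decision}, defaulting to analysis_agent")
                return "analysis_agent"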
        
        return "__end__"
```
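
The keyword checks in `supervisor_router` use substring matching, so a request containing "already" or "profile" is sent to the data ingestion agent. One possible refinement, sketched below under the assumption that whole-word matching is acceptable, uses `re` word boundaries. The names `KEYWORD_ROUTES` and `route_by_keyword` are hypothetical; the keyword lists are copied from the router above.

```python
import re

# Keyword lists copied from supervisor_router; the table name is hypothetical.
KEYWORD_ROUTES = [
    ("data_ingestion_agent", ["read", "file", "upload", "document", "pdf", "csv"]),
    ("analysis_agent", ["analyze", "calculate", "compute", "average", "statistics"]),
    ("query_agent", ["query", "database", "conversation", "history", "show me"]),
    ("report_agent", ["report", "summary", "generate", "create"]),
    ("notification_agent", ["notify", "alert", "send", "message"]),
]


def route_by_keyword(text: str, default: str = "analysis_agent") -> str:
    """Return the first agent whose keyword appears as a whole word or phrase."""
    lowered = text.lower()
    for agent, keywords in KEYWORD_ROUTES:
        for keyword in keywords:
            # \b anchors keep "read" from matching inside "already"
            if re.search(rf"\b{re.escape(keyword)}\b", lowered):
                return agent
    return default


print(route_by_keyword("Read the uploaded PDF"))
print(route_by_keyword("I already updated my profile"))
```

The trade-off is that inflected forms such as "analyzing" no longer match "analyze", so the keyword lists would need to grow, which is one more reason to move to LLM-based routing later.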

***

### **File 5: `backend/app/agents/data_ingestion_agent.py`**

```python
"""
Data Ingestion Agent

Specializes in reading and processing files of various formats.
Uses the read_file_tool from MCP server.
"""

from app.agents.base_agent import BaseAgent
from app.mcp.tools.file_reader import read_file_tool
from app.utils.prompts import DATA_INGESTION_AGENT_PROMPT


class DataIngestionAgent(BaseAgent):
    """
    Agent specialized in file reading and data extraction.
    
    Capabilities:
    - Read text, JSON, CSV, PDF, Markdown files
    - Extract structured data
    - Summarize file contents
    - Handle large files efficiently
    """
    
    def __init__(self):
        # Convert MCP tool to LangChain-compatible format
        tools = [
            {
                "name": "read_file",
                "description": "Read and parse files (text, JSON, CSV, PDF, Markdown)",
                "function": read_file_tool,
            }
        ]
        
        super().__init__(
            name="data_ingestion_agent",
            system_prompt=DATA_INGESTION_AGENT_PROMPT,
            tools=tools,
            temperature=0.2,  # Low creativity for consistent parsing
        )
```

***

### **File 6: `backend/app/agents/analysis_agent.py`**

```python
"""
Analysis Agent

Specializes in data analysis, calculations, and statistical processing.
"""

from app.agents.base_agent import BaseAgent
from app.utils.prompts import ANALYSIS_AGENT_PROMPT


class AnalysisAgent(BaseAgent):
    """
    Agent specialized in data analysis and calculations.
    
    Capabilities:
    - Statistical analysis (mean, median, std dev)
    - Mathematical calculations
    - Data aggregations
    - Trend analysis
    - Pattern recognition
    """
    
    def __init__(self):
        # Analysis agent uses LLM's built-in calculation capabilities
        # No external tools needed for Phase 1
        super().__init__(
            name="analysis_agent",
            system_prompt=ANALYSIS_AGENT_PROMPT,
            tools=[],
            temperature=0.3,  # Moderate for balanced analytical reasoning
        )
```

***

### **File 7: `backend/app/agents/query_agent.py`**

```python
"""
Query Agent

Specializes in database queries and data retrieval.
Uses the query_database_tool from MCP server.
"""

from app.agents.base_agent import BaseAgent
from app.mcp.tools.database_connector import query_database_tool
from app.utils.prompts import QUERY_AGENT_PROMPT


class QueryAgent(BaseAgent):
    """
    Agent specialized in SQL query generation and database access.
    
    Capabilities:
    - Generate SQL queries
    - Retrieve conversation history
    - Fetch agent performance metrics
    - Search and filter data
    """
    
    def __init__(self):
        tools = [
            {
                "name": "query_database",
                "description": "Execute SQL queries on PostgreSQL database",
                "function": query_database_tool,
            }
        ]
        
        super().__init__(
            name="query_agent",
            system_prompt=QUERY_AGENT_PROMPT,
            tools=tools,
            temperature=0.0,  # Deterministic SQL generation
        )
```

***

### **File 8: `backend/app/agents/report_agent.py`**

```python
"""
Report Agent

Specializes in generating formatted reports and summaries.
"""

from app.agents.base_agent import BaseAgent
from app.utils.prompts import REPORT_AGENT_PROMPT


class ReportAgent(BaseAgent):
    """
    Agent specialized in report generation and formatting.
    
    Capabilities:
    - Create executive summaries
    - Generate detailed reports
    - Format data in Markdown/tables
    - Highlight key insights
    - Structure information clearly
    """
    
    def __init__(self):
        super().__init__(
            name="report_agent",
            system_prompt=REPORT_AGENT_PROMPT,
            tools=[],
            temperature=0.7,  # Higher creativity for engaging report writing
        )
```

***

### **File 9: `backend/app/agents/notification_agent.py`**

```python
"""
Notification Agent

Specializes in sending alerts and notifications.
Uses the call_external_api_tool for webhook delivery.
"""

from app.agents.base_agent import BaseAgent
from app.mcp.tools.api_caller import call_external_api_tool
from app.utils.prompts import NOTIFICATION_AGENT_PROMPT


class NotificationAgent(BaseAgent):
    """
    Agent specialized in notifications and alerts.
    
    Capabilities:
    - Generate formatted notifications
    - Send alerts to external services
    - Handle different priority levels
    - Track notification delivery
    """
    
    def __init__(self):
        tools = [
            {
                "name": "call_api",
                "description": "Make HTTP requests to external APIs (webhooks, Slack, etc.)",
                "function": call_external_api_tool,
            }
        ]
        
        super().__init__(
            name="notification_agent",
            system_prompt=NOTIFICATION_AGENT_PROMPT,
            tools=tools,
            temperature=0.5,  # Balanced for clear communication
        )
```

***

Due to length constraints, let me provide the LangGraph state and workflow implementation in the next message. Reply "continue" for the final part!



---




Perfect! Let's complete the LangGraph implementation with state management and the supervisor workflow.

***

### **File 10: `backend/app/graphs/__init__.py`**

```python
"""LangGraph workflows for multi-agent orchestration."""
```

***

### **File 11: `backend/app/graphs/state.py`**

```python
"""
LangGraph State Definitions

This module defines the state structure for the multi-agent workflow.

State in LangGraph:
- TypedDict defines the shape of data flowing through the graph
- Annotated fields specify how state updates are merged
- Messages accumulate (add_messages), other fields replace

DESIGN PRINCIPLES:
1. Keep state minimal - only what's needed across agents
2. Use messages for conversation history
3. Store metadata for debugging and monitoring
4. Enable checkpointing for conversation persistence
"""

from typing import Annotated, Literal, Optional, TypedDict

from langgraph.graph.message import add_messages


class AgentState(TypedDict):
    """
    State structure for the multi-agent system.
    
    This state flows through the LangGraph workflow and is accessible
    to all agents and nodes.
    
    PROMPT ENGINEERING NOTE:
    Messages are the primary way agents communicate. Each agent:
    1. Reads previous messages for context
    2. Executes its task
    3. Adds its output as a new message
    4. Supervisor reads all messages to make routing decisions
    
    Attributes:
        messages: Conversation history (accumulated with add_messages)
        next_agent: Which agent should execute next
        current_agent: Currently executing agent
        conversation_id: ID for conversation persistence
        user_id: User who initiated the conversation
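        metadata: Additional context and execution metadata
        execution_count: Number of supervisor routing steps so far
        max_iterations: Upper bound on routing steps (prevents infinite loops)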
    """
    
    # Messages accumulate (don't replace)
    # add_messages handles appending, deduplication, and ordering
    messages: Annotated[list, add_messages]
    
    # Routing information (replaces on each update)
    next_agent: Optional[str]
    current_agent: Optional[str]
    
    # Conversation tracking
    conversation_id: Optional[str]
    user_id: Optional[str]
    
    # Execution metadata
    metadata: dict
    
    # Agent execution tracking
    execution_count: int
    max_iterations: int


class SupervisorState(TypedDict):
    """
    Extended state for supervisor-specific information.
    
    Used when supervisor needs to make complex routing decisions
    or track multi-step delegations.
    """
    messages: Annotated[list, add_messages]
    next_agent: str
    task_queue: list[str]  # Queue of pending tasks
    completed_agents: list[str]  # Agents that have already executed
    metadata: dict


def create_initial_state(
    user_message: str,
    conversation_id: Optional[str] = None,
    user_id: Optional[str] = None,
) -> AgentState:
    """
    Create initial state for a new conversation.
    
    Args:
        user_message: User's initial message
        conversation_id: Optional conversation ID for resuming
        user_id: Optional user identifier
    
    Returns:
        Initialized AgentState
    """
    from langchain_core.messages import HumanMessage
    
    return AgentState(
        messages=[HumanMessage(content=user_message)],
        next_agent="supervisor",  # Always start with supervisor
        current_agent=None,
        conversation_id=conversation_id,
        user_id=user_id,
        metadata={
            "started_at": None,  # Will be set by workflow
            "status": "initiated",
        },
        execution_count=0,
        max_iterations=10,  # Prevent infinite loops
    )
```
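
The docstrings above say that `messages` accumulate while every other field is replaced on update. The sketch below imitates that merge rule in plain Python so the behavior is easy to see. It is a simplified stand-in, not LangGraph's implementation (the real `add_messages` also handles message IDs), and the names `append_reducer`, `REDUCERS`, and `apply_update` are hypothetical.

```python
from typing import Any, Callable


def append_reducer(existing: list, new: list) -> list:
    """Accumulate: return the old items followed by the new ones."""
    return existing + new


# Fields listed here are merged with a reducer; all others are overwritten.
REDUCERS: dict[str, Callable[[Any, Any], Any]] = {"messages": append_reducer}


def apply_update(state: dict[str, Any], update: dict[str, Any]) -> dict[str, Any]:
    """Merge a node's partial update into the state without mutating either."""
    merged = dict(state)
    for key, value in update.items():
        if key in REDUCERS and key in merged:
            merged[key] = REDUCERS[key](merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"messages": ["user: average of 10, 20, 30"], "next_agent": "supervisor"}
update = {"messages": ["analysis_agent: 20"], "next_agent": "__end__"}
state = apply_update(state, update)
print(state)
```

After the update, `messages` holds both entries while `next_agent` holds only the latest value.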

***

### **File 12: `backend/app/graphs/supervisor_graph.py`**

```python
"""
Supervisor Multi-Agent Workflow using LangGraph

This module implements the supervisor pattern where:
1. User sends message to supervisor
2. Supervisor analyzes and routes to worker agent
3. Worker executes task and returns result
4. Supervisor decides next step (continue or finish)
5. Final result returned to user

ARCHITECTURE:
┌─────────┐
│  User   │
└────┬────┘
     │
     v
┌────────────┐      ┌──────────────────┐
│ Supervisor │─────>│ Data Ingestion   │
└────┬───────┘      │ Analysis         │
     │              │ Query            │
     │              │ Report           │
     │              │ Notification     │
     │              └────────┬─────────┘
     │                       │
     v                       v
  Finish <─────────── Agent Result
  
CHECKPOINTING:
- LangGraph saves state after each node execution
- Enables conversation memory (agents remember previous interactions)
- Allows resuming conversations after interruption
- Provides time-travel debugging (inspect state at any point)
"""

import uuid
from typing import Literal, Optional

from langchain_core.messages import AIMessage, HumanMessage
from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.graph import END, START, StateGraph

from app.agents.analysis_agent import AnalysisAgent
from app.agents.data_ingestion_agent import DataIngestionAgent
from app.agents.notification_agent import NotificationAgent
from app.agents.query_agent import QueryAgent
from app.agents.report_agent import ReportAgent
from app.agents.supervisor import SupervisorAgent
from app.config import settings
from app.graphs.state import AgentState
from app.utils.logger import get_logger

log = get_logger(__name__)


# =============================================================================
# INITIALIZE AGENTS
# =============================================================================
# Create singleton instances of all agents
supervisor = SupervisorAgent()
data_ingestion_agent = DataIngestionAgent()
analysis_agent = AnalysisAgent()
query_agent = QueryAgent()
report_agent = ReportAgent()
notification_agent = NotificationAgent()


# =============================================================================
# AGENT NODE FUNCTIONS
# =============================================================================
async def supervisor_node(state: AgentState) -> AgentState:
    """
    Supervisor node: Routes to appropriate worker agent.
    
    The supervisor analyzes the conversation history and decides
    which agent should handle the task, or if processing is complete.
    
    Args:
        state: Current workflow state
    
    Returns:
        Updated state with routing decision
    """
    log.info("Supervisor analyzing request")
    
    # Get last message (user's request or previous agent output)
    messages = state["messages"]
    last_message = messages[-1].content if messages else ""
    
    # Supervisor makes routing decision
    result = await supervisor.route_task(
        user_message=last_message,
        context=state.get("metadata", {}),
    )
    
    # Update state with routing decision
    state["next_agent"] = result
    state["current_agent"] = "supervisor"
    state["execution_count"] = state.get("execution_count", 0) + 1
    
    log.info(
        "Supervisor routing decision",
        next_agent=result,
        execution_count=state["execution_count"],
    )
    
    return state


async def data_ingestion_node(state: AgentState) -> AgentState:
    """Data ingestion agent node."""
    log.info("Data ingestion agent executing")
    
    # Get task from last message
    messages = state["messages"]
    task = messages[-1].content if messages else ""
    
    # Execute agent
    result = await data_ingestion_agent.execute(
        message=task,
        context=state.get("metadata", {}),
    )
    
    # Add agent output to messages
    output_message = AIMessage(
        content=result["output"] if result["success"] else f"Error: {result.get('error')}",
        name="data_ingestion_agent",
    )
    
    state["messages"].append(output_message)
    state["current_agent"] = "data_ingestion_agent"
    state["next_agent"] = "supervisor"  # Return to supervisor for next routing
    
    return state


async def analysis_node(state: AgentState) -> AgentState:
    """Analysis agent node."""
    log.info("Analysis agent executing")
    
    messages = state["messages"]
    task = messages[-1].content if messages else ""
    
    result = await analysis_agent.execute(
        message=task,
        context=state.get("metadata", {}),
    )
    
    output_message = AIMessage(
        content=result["output"] if result["success"] else f"Error: {result.get('error')}",
        name="analysis_agent",
    )
    
    state["messages"].append(output_message)
    state["current_agent"] = "analysis_agent"
    state["next_agent"] = "supervisor"
    
    return state


async def query_node(state: AgentState) -> AgentState:
    """Query agent node."""
    log.info("Query agent executing")
    
    messages = state["messages"]
    task = messages[-1].content if messages else ""
    
    result = await query_agent.execute(
        message=task,
        context=state.get("metadata", {}),
    )
    
    output_message = AIMessage(
        content=result["output"] if result["success"] else f"Error: {result.get('error')}",
        name="query_agent",
    )
    
    state["messages"].append(output_message)
    state["current_agent"] = "query_agent"
    state["next_agent"] = "supervisor"
    
    return state


async def report_node(state: AgentState) -> AgentState:
    """Report agent node."""
    log.info("Report agent executing")
    
    messages = state["messages"]
    task = messages[-1].content if messages else ""
    
    result = await report_agent.execute(
        message=task,
        context=state.get("metadata", {}),
    )
    
    output_message = AIMessage(
        content=result["output"] if result["success"] else f"Error: {result.get('error')}",
        name="report_agent",
    )
    
    state["messages"].append(output_message)
    state["current_agent"] = "report_agent"
    state["next_agent"] = "supervisor"
    
    return state


async def notification_node(state: AgentState) -> AgentState:
    """Notification agent node."""
    log.info("Notification agent executing")
    
    messages = state["messages"]
    task = messages[-1].content if messages else ""
    
    result = await notification_agent.execute(
        message=task,
        context=state.get("metadata", {}),
    )
    
    output_message = AIMessage(
        content=result["output"] if result["success"] else f"Error: {result.get('error')}",
        name="notification_agent",
    )
    
    state["messages"].append(output_message)
    state["current_agent"] = "notification_agent"
    state["next_agent"] = "supervisor"
    
    return state


# =============================================================================
# ROUTING FUNCTION
# =============================================================================
def router(
    state: AgentState,
) -> Literal["data_ingestion_agent", "analysis_agent", "query_agent", "report_agent", "notification_agent", "supervisor", END]:
    """
    Route to next node based on state.
    
    This function is called after each node execution to determine
    where to go next in the workflow.
    
    Args:
        state: Current workflow state
    
    Returns:
        Next node name or END
    """
    next_agent = state.get("next_agent", END)
    execution_count = state.get("execution_count", 0)
    max_iterations = state.get("max_iterations", 10)
    
    # Safety check: prevent infinite loops
    if execution_count >= max_iterations:
        log.warning(
            "Max iterations reached, ending workflow",
            execution_count=execution_count,
        )
        return END
    
    # Route based on next_agent
    if next_agent == "__end__" or next_agent == END:
        return END
    
    log.info("Routing to next node", next_agent=next_agent)
    return next_agent


# =============================================================================
# BUILD WORKFLOW GRAPH
# =============================================================================
def create_supervisor_workflow() -> StateGraph:
    """
    Create the supervisor multi-agent workflow graph.
    
    Graph structure:
    START -> supervisor -> [worker agents] -> supervisor -> END
    
    The supervisor acts as a hub, routing tasks to specialized workers
    and synthesizing their outputs.
    
    Returns:
        Uncompiled StateGraph; call .compile() before execution
    """
    
    # Initialize graph
    workflow = StateGraph(AgentState)
    
    # Add nodes
    workflow.add_node("supervisor", supervisor_node)
    workflow.add_node("data_ingestion_agent", data_ingestion_node)
    workflow.add_node("analysis_agent", analysis_node)
    workflow.add_node("query_agent", query_node)
    workflow.add_node("report_agent", report_node)
    workflow.add_node("notification_agent", notification_node)
    
    # Define edges
    # Start -> Supervisor
    workflow.add_edge(START, "supervisor")
    
    # Supervisor -> Workers (conditional routing)
    workflow.add_conditional_edges(
        "supervisor",
        router,
        {
            "data_ingestion_agent": "data_ingestion_agent",
            "analysis_agent": "analysis_agent",
            "query_agent": "query_agent",
            "report_agent": "report_agent",
            "notification_agent": "notification_agent",
            END: END,
        },
    )
    
    # Workers -> Supervisor (return for next routing decision)
    for agent in ["data_ingestion_agent", "analysis_agent", "query_agent", "report_agent", "notification_agent"]:
        workflow.add_edge(agent, "supervisor")
    
    log.info("Supervisor workflow graph created")
    
    return workflow


# =============================================================================
# INITIALIZE CHECKPOINTER
# =============================================================================
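# Cached (context manager, saver) pair so the connection is opened only once
# and the context manager is not garbage-collected while the saver is in use.
_checkpointer_state: dict = {}


async def get_checkpointer():
    """
    Create (once) the PostgreSQL checkpointer for conversation persistence.
    
    The checkpointer:
    - Saves state after each node execution
    - Enables conversation memory
    - Allows resuming interrupted conversations
    - Provides debugging capabilities
    
    NOTE: from_conn_string() is a context manager, and a graph executed with
    ainvoke() needs the async saver. The context is entered manually here so
    the connection stays open for the process lifetime; in application code,
    prefer `async with` inside startup/shutdown hooks.
    
    Returns:
        Configured AsyncPostgresSaver instance
    """
    from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver
    
    if "saver" not in _checkpointer_state:
        saver_cm = AsyncPostgresSaver.from_conn_string(settings.DATABASE_URL)
        checkpointer = await saver_cm.__aenter__()
        
        # Setup tables (idempotent - safe to call multiple times)
        await checkpointer.setup()
        
        _checkpointer_state["cm"] = saver_cm
        _checkpointer_state["saver"] = checkpointer
        log.info("Checkpointer initialized")
    
    return _checkpointer_state["saver"]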


# =============================================================================
# COMPILE WORKFLOW
# =============================================================================
async def get_supervisor_graph():
    """
    Get compiled supervisor workflow graph with checkpointing.
    
    This is the main entry point for executing the multi-agent workflow.
    
    Returns:
        Compiled graph ready for invocation
    """
    workflow = create_supervisor_workflow()
    checkpointer = await get_checkpointer()
    
    # Compile with checkpointing enabled
    graph = workflow.compile(checkpointer=checkpointer)
    
    log.info("Supervisor graph compiled with checkpointing")
    
    return graph


# =============================================================================
# EXECUTION HELPER
# =============================================================================
async def execute_agent_workflow(
    user_message: str,
    conversation_id: Optional[str] = None,
    user_id: Optional[str] = None,
) -> dict:
    """
    Execute the multi-agent workflow.
    
    This is a convenience function that:
    1. Creates initial state
    2. Gets compiled graph
    3. Executes workflow
    4. Returns final result
    
    Args:
        user_message: User's request
        conversation_id: Optional conversation ID to resume
        user_id: Optional user identifier
    
    Returns:
        Workflow execution result with messages and metadata
    """
    from app.graphs.state import create_initial_state
    
    # Generate conversation ID if not provided
    if not conversation_id:
        conversation_id = str(uuid.uuid4())
    
    # Create initial state
    initial_state = create_initial_state(
        user_message=user_message,
        conversation_id=conversation_id,
        user_id=user_id,
    )
    
    # Get compiled graph
    graph = await get_supervisor_graph()
    
    # Execute workflow with checkpointing
    config = {
        "configurable": {
            "thread_id": conversation_id,  # For conversation persistence
        }
    }
    
    log.info(
        "Starting workflow execution",
        conversation_id=conversation_id,
        user_id=user_id,
    )
    
    try:
        # Invoke graph asynchronously and wait for the final state
        final_state = await graph.ainvoke(initial_state, config)
        
        # Extract final message (last agent output)
        messages = final_state.get("messages", [])
        final_message = messages[-1].content if messages else "No response generated"
        
        log.info(
            "Workflow execution completed",
            conversation_id=conversation_id,
            message_count=len(messages),
        )
        
        return {
            "success": True,
            "conversation_id": conversation_id,
            "response": final_message,
            "messages": [{"role": m.type, "content": m.content} for m in messages],
            "metadata": final_state.get("metadata", {}),
        }
    
    except Exception as e:
        log.error(
            "Workflow execution failed",
            conversation_id=conversation_id,
            exc_info=e,
        )
        
        return {
            "success": False,
            "conversation_id": conversation_id,
            "error": str(e),
        }
```
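
The control flow of the graph (supervisor decides, worker runs, control returns to the supervisor, with a cap on iterations) can be hard to see through the node boilerplate. Below is a minimal plain-Python simulation of that loop. It is a sketch of the pattern only, not of LangGraph's execution model, and `run_hub_and_spoke`, `toy_supervisor`, and `toy_analysis` are hypothetical names.

```python
from typing import Callable

END = "__end__"


def run_hub_and_spoke(
    supervisor: Callable[[list[str]], str],
    workers: dict[str, Callable[[list[str]], str]],
    first_message: str,
    max_iterations: int = 10,
) -> list[str]:
    """Alternate supervisor and worker steps until END or the iteration cap."""
    messages = [first_message]
    for _ in range(max_iterations):
        next_agent = supervisor(messages)
        if next_agent == END:
            break
        # The chosen worker sees the full history and appends its output
        messages.append(workers[next_agent](messages))
    return messages


def toy_supervisor(messages: list[str]) -> str:
    # Route the first request to analysis, then stop once a worker has replied
    return "analysis_agent" if len(messages) == 1 else END


def toy_analysis(messages: list[str]) -> str:
    return "analysis_agent: the average is 30"


history = run_hub_and_spoke(
    toy_supervisor, {"analysis_agent": toy_analysis}, "Average of 10, 20, 30, 40, 50"
)
print(history)
```

A supervisor that never returns `END` would still stop after `max_iterations` worker calls, which is the role `execution_count` and `max_iterations` play in `router`.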

***

## **CHECKPOINT TEST: Verify Agent System**

```bash
# From backend/ directory with venv activated

# Test individual agents
python -c "
import asyncio
from app.agents.analysis_agent import AnalysisAgent

async def test():
    agent = AnalysisAgent()
    result = await agent.execute('What is 25% of 80?')
    print(f'✓ Analysis Agent: {result[\"output\"][:100]}...')

asyncio.run(test())
"

# Test supervisor workflow
python -c "
import asyncio
from app.graphs.supervisor_graph import execute_agent_workflow

async def test():
    result = await execute_agent_workflow(
        user_message='Calculate the average of these numbers: 10, 20, 30, 40, 50',
        user_id='test_user'
    )
    
    print(f'✓ Workflow Success: {result[\"success\"]}')
    print(f'  Conversation ID: {result[\"conversation_id\"]}')
    print(f'  Response: {result[\"response\"][:200]}...')
    print(f'  Message Count: {len(result[\"messages\"])}')

asyncio.run(test())
"

# Test with file reading (create test file first)
python -c "
import asyncio
import os
from app.graphs.supervisor_graph import execute_agent_workflow

async def test():
    # Create test file
    os.makedirs('data/uploads', exist_ok=True)
    with open('data/uploads/test_data.txt', 'w') as f:
        f.write('Sales data: Q1=100, Q2=150, Q3=200, Q4=180')
    
    result = await execute_agent_workflow(
        user_message='Read the file data/uploads/test_data.txt and analyze the quarterly sales',
        user_id='test_user'
    )
    
    print(f'✓ File + Analysis Workflow: {result[\"success\"]}')
    print(f'  Response: {result[\"response\"][:300]}...')

asyncio.run(test())
"
```
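
Note that the test snippets slice `result["output"]` and `result["response"]` directly, which raises if the run failed, because failed results carry `None` or no such key. A small helper along the following lines would print the error instead. The name `preview` and its parameters are hypothetical; the result shapes are the ones returned by `execute()` and `execute_agent_workflow()` above.

```python
def preview(result: dict, key: str = "output", limit: int = 100) -> str:
    """Return a short preview of a result field, or the error on failure."""
    if not result.get("success"):
        return f"FAILED: {result.get('error', 'unknown error')}"
    text = str(result.get(key) or "")
    # Truncate long outputs and mark the cut with an ellipsis
    return text if len(text) <= limit else text[:limit] + "..."


ok = {"success": True, "output": "x" * 150}
failed = {"success": False, "output": None, "error": "timeout"}
print(preview(ok))
print(preview(failed))
```

For workflow results, pass `key="response"`, for example `preview(result, key="response", limit=200)`.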

***

## **WHAT YOU'VE LEARNED**

1. ✅ **Prompt Engineering**: Comprehensive system prompts guiding agent behavior
2. ✅ **Base Agent Pattern**: Reusable class with LLM initialization and tool binding
3. ✅ **Specialized Agents**: 5 worker agents + 1 supervisor with distinct roles
4. ✅ **LangGraph State**: TypedDict with annotated fields for state management
5. ✅ **Supervisor Pattern**: Hub-and-spoke architecture with intelligent routing
6. ✅ **Checkpointing**: PostgreSQL-backed conversation persistence
7. ✅ **Workflow Orchestration**: Conditional edges and routing logic

***

**Reply "done" when all agents are working correctly, then we'll proceed to Step 1.5: FastAPI Routes & WebSockets (the final step of Phase 1!)**[1][2][3][4][5]

[1](https://langchain-ai.github.io/langgraph/tutorials/multi_agent/agent_supervisor/)
[2](https://github.com/langchain-ai/langgraph-supervisor-py)
[3](https://dev.to/sreeni5018/building-multi-agent-systems-with-langgraph-supervisor-138i)
[4](https://python.langchain.com/docs/tutorials/agents/)
[5](https://python.langchain.com/api_reference/langchain/agents/langchain.agents.react.agent.create_react_agent.html)