# Chapter 20: Capstone Project - Building a Complete Agent System
**From: Zero to AI Agent**

## Overview

This capstone project guides you through building **CASPAR** - a production-ready Customer Service AI Agent using everything you've learned in this book.

**CASPAR** = **C**ustomer **A**ssistant for **S**upport, **P**roblem-solving, **A**nd **R**esolution

### What CASPAR Does
- Answers customer questions using a knowledge base (RAG)
- Looks up orders and provides status updates
- Detects frustrated customers and escalates appropriately
- Creates support tickets when issues need follow-up
- Transfers to human agents when needed
- Remembers conversation context with PostgreSQL persistence

### Project Structure
The complete CASPAR project is in the `caspar/` folder alongside this notebook. The code below is for **reference** - to run CASPAR, use the actual project files:

```bash
cd caspar
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt
pip install -e .
docker compose up -d postgres  # Start only the PostgreSQL service
python scripts/verify_setup.py
```

### Sections
- 20.1: Project setup and configuration
- 20.2: Agent architecture design (LangGraph)
- 20.3: Knowledge retrieval (RAG with ChromaDB)
- 20.4: Conversation flow and tools
- 20.5: Human handoff system
- 20.6: Testing and evaluation
- 20.7: Deployment with Docker and FastAPI

In [None]:
# This notebook is for REFERENCE only
# To run CASPAR, navigate to the caspar/ folder and follow the setup instructions above

# Quick check that the caspar folder exists
import os
# Notebooks do not define __file__, so resolve relative to the working directory
caspar_path = os.path.join(os.getcwd(), "caspar")
if os.path.exists(caspar_path):
    print(f"✓ CASPAR project found at: {caspar_path}")
    print(f"  Files: {len(os.listdir(caspar_path))} items")
else:
    print("⚠ CASPAR folder not found - please ensure caspar/ is in the same directory")

---
## Section 20.1: Project overview: Customer service automation agent

This section sets up the CASPAR project: dependencies, packaging, the PostgreSQL container, configuration management, structured logging, and a script that verifies the whole setup.

**Key files:**
- `requirements.txt` - Project dependencies
- `pyproject.toml` - Package configuration
- `docker-compose.yml` - PostgreSQL database setup
- `src/caspar/config/settings.py` - Configuration management
- `src/caspar/config/logging.py` - Structured logging
- `scripts/verify_setup.py` - Setup verification script

### requirements.txt

In [None]:
# Save as: requirements.txt

# Core LLM and Agent frameworks
langchain==1.1.1
langchain-core==1.1.0
langchain-openai==1.1.0
langchain-text-splitters==1.0.0
langgraph==1.0.4
langgraph-checkpoint-postgres==3.0.1

# Vector database for RAG
chromadb==1.3.5
langchain-chroma==1.0.0

# API framework
fastapi==0.123.5
uvicorn==0.38.0
python-multipart==0.0.20

# Database
psycopg==3.3.1
psycopg-binary==3.3.1
psycopg-pool==3.3.0
asyncpg==0.31.0

# Configuration and utilities
pydantic==2.12.5
pydantic-settings==2.12.0
python-dotenv==1.2.1

# Logging and monitoring
structlog==25.5.0

# Testing
pytest==9.0.1
pytest-asyncio==1.3.0
pytest-cov==7.0.0
pytest-timeout==2.4.0
httpx==0.28.1
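
Because every dependency is pinned with `==`, an environment that has drifted can be detected by comparing the pins with what is actually installed. The following is a minimal sketch using only the standard library; `parse_pins` and `find_mismatches` are hypothetical helper names and are not part of the CASPAR project.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Parse 'name==version' lines, skipping blanks and comments."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (pinned, installed)} for every pin that does not match."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


sample = """
# Core LLM and Agent frameworks
langgraph==1.0.4
structlog==25.5.0
"""
pins = parse_pins(sample)
print(pins)
```

Calling `find_mismatches(pins)` inside the project's virtual environment should return an empty dictionary once `pip install -r requirements.txt` has completed.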


### pyproject.toml

In [None]:
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "caspar"
version = "1.0.0"
description = "CASPAR - Customer Assistant for Support, Problem-solving, And Resolution"
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
    # Core LLM and Agent frameworks
    "langchain==1.1.1",
    "langchain-core==1.1.0",
    "langchain-openai==1.1.0",
    "langchain-text-splitters==1.0.0",
    "langgraph==1.0.4",
    "langgraph-checkpoint-postgres==3.0.1",
    
    # Vector database for RAG
    "chromadb==1.3.5",
    "langchain-chroma==1.0.0",
    
    # API framework
    "fastapi==0.123.5",
    "uvicorn==0.38.0",
    "python-multipart==0.0.20",
    
    # Database
    "psycopg==3.3.1",
    "psycopg-binary==3.3.1",
    "psycopg-pool==3.3.0",
    "asyncpg==0.31.0",
    
    # Configuration and utilities
    "pydantic==2.12.5",
    "pydantic-settings==2.12.0",
    "python-dotenv==1.2.1",
    
    # Logging and monitoring
    "structlog==25.5.0",
]

[project.optional-dependencies]
dev = [
    "pytest==9.0.1",
    "pytest-asyncio==1.3.0",
    "pytest-cov==7.0.0",
    "pytest-timeout==2.4.0",
    "httpx==0.28.1",
]

[tool.setuptools.packages.find]
where = ["src"]

[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]
addopts = "-v --tb=short"


### docker-compose.yml

In [None]:
# Save as: docker-compose.yml

version: '3.8'

services:
  caspar:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - DATABASE_URL=postgresql://caspar:caspar_secret@postgres:5432/caspar_db
      - LOG_LEVEL=INFO
      - ENVIRONMENT=development
    volumes:
      # Mount for development hot-reload
      - ./src:/app/src:ro
      - ./data:/app/data:ro
    depends_on:
      postgres:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      start_period: 40s
      retries: 3

  # PostgreSQL for conversation persistence
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: caspar
      POSTGRES_PASSWORD: caspar_secret
      POSTGRES_DB: caspar_db
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U caspar -d caspar_db"]
      interval: 10s
      timeout: 5s
      retries: 5

  # Optional: Redis for session management when scaling horizontally
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

volumes:
  postgres_data:
  redis_data:


### .env.example

In [None]:
# Save as: .env.example

# OpenAI API key (required)
OPENAI_API_KEY=your-openai-api-key-here

# PostgreSQL database URL for conversation persistence (required for production)
# Format: postgresql://user:password@host:port/database
DATABASE_URL=postgresql://caspar:caspar_secret@localhost:5432/caspar_db

# Logging level (DEBUG, INFO, WARNING, ERROR)
LOG_LEVEL=INFO

# Environment (development, production)
ENVIRONMENT=development

# Optional: Model configuration (names match the fields in config/settings.py)
DEFAULT_MODEL=gpt-4o-mini
SMART_MODEL=gpt-4o

# Optional: API port (Railway sets this automatically)
PORT=8000
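
The `DATABASE_URL` values in `docker-compose.yml` and `.env.example` follow the same `postgresql://user:password@host:port/database` format and differ in one component only. A small sketch with `urllib.parse` makes that visible; `describe_database_url` is a hypothetical helper written for this illustration.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a PostgreSQL URL into its parts (the password is deliberately omitted)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


local = describe_database_url(
    "postgresql://caspar:caspar_secret@localhost:5432/caspar_db"
)
in_compose = describe_database_url(
    "postgresql://caspar:caspar_secret@postgres:5432/caspar_db"
)

# Collect the components whose values differ between the two URLs
changed = {key for key in local if local[key] != in_compose[key]}
print(changed)  # {'host'}
```

From your own machine the database is reached at `localhost` through the published port, while a container on the Compose network reaches it by the service name `postgres`.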


### config/settings.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.1
# File: src/caspar/config/settings.py

"""
CASPAR Configuration Settings

This module provides centralized configuration management using Pydantic Settings.
All configuration is loaded from environment variables with sensible defaults.
"""

from functools import lru_cache
from pathlib import Path
from pydantic_settings import BaseSettings, SettingsConfigDict
from pydantic import Field


def get_project_root() -> Path:
    """Find the project root directory (where .env lives)."""
    # Start from this file's directory and go up until we find .env or pyproject.toml
    current = Path(__file__).resolve().parent
    
    for parent in [current] + list(current.parents):
        if (parent / ".env").exists() or (parent / "pyproject.toml").exists():
            return parent
    
    # Fallback to current working directory
    return Path.cwd()


# Get path to .env file
PROJECT_ROOT = get_project_root()
ENV_FILE = PROJECT_ROOT / ".env"


class Settings(BaseSettings):
    """Application settings loaded from environment variables."""
    
    # Pydantic Settings v2 configuration
    model_config = SettingsConfigDict(
        env_file=str(ENV_FILE),
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="ignore",  # Ignore extra env vars
    )
    
    # OpenAI Configuration
    openai_api_key: str = Field(..., description="OpenAI API key")
    default_model: str = Field(
        default="gpt-4o-mini",
        description="Default LLM model for most operations"
    )
    smart_model: str = Field(
        default="gpt-4o",
        description="Smarter model for complex reasoning"
    )
    
    # Database Configuration
    database_url: str = Field(
        default="postgresql://caspar:caspar_secret@localhost:5432/caspar_db",
        description="PostgreSQL connection string"
    )
    
    # Application Settings
    environment: str = Field(
        default="development",
        description="Environment (development, staging, production)"
    )
    debug: bool = Field(
        default=False,
        description="Enable debug mode"
    )
    log_level: str = Field(
        default="INFO",
        description="Logging level"
    )
    
    # Agent Configuration
    max_conversation_turns: int = Field(
        default=50,
        description="Maximum turns before suggesting human handoff"
    )
    sentiment_threshold: float = Field(
        default=-0.5,
        description="Sentiment score below which to escalate"
    )
    
    # RAG Configuration
    chroma_persist_directory: str = Field(
        default="./chroma_data",
        description="Directory for ChromaDB persistence"
    )
    retrieval_k: int = Field(
        default=4,
        description="Number of documents to retrieve for RAG"
    )
    
    # API Configuration
    api_host: str = Field(default="0.0.0.0", description="API host")
    api_port: int = Field(default=8000, description="API port")


@lru_cache()
def get_settings() -> Settings:
    """
    Get cached settings instance.
    
    Using lru_cache ensures we only load settings once,
    improving performance and consistency.
    """
    return Settings()


# Module-level instance for quick access (created at import time,
# so OPENAI_API_KEY must already be set or present in .env)
settings = get_settings()
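
The effect of `@lru_cache()` on `get_settings()` can be seen in isolation. The sketch below swaps the Pydantic `Settings` class for a plain dataclass so it runs without any dependencies; `DemoSettings` and `get_demo_settings` are stand-ins invented for this example.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class DemoSettings:
    """Stand-in for the Pydantic Settings class; reads a single variable."""
    log_level: str


@lru_cache()
def get_demo_settings() -> DemoSettings:
    # The body runs once; later calls return the cached instance
    return DemoSettings(log_level=os.environ.get("LOG_LEVEL", "INFO"))


os.environ["LOG_LEVEL"] = "DEBUG"
first = get_demo_settings()

os.environ["LOG_LEVEL"] = "ERROR"  # changed after the first call
second = get_demo_settings()

print(first is second)    # True
print(second.log_level)   # DEBUG

get_demo_settings.cache_clear()  # e.g. in tests, to force a reload
print(get_demo_settings().log_level)  # ERROR
```

The practical consequence is that environment changes made after the first call are not picked up until the cache is cleared, which is mostly relevant in tests.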


### config/logging.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.1
# File: src/caspar/config/logging.py

"""
CASPAR Logging Configuration

Provides structured logging using structlog for better
observability in production environments.
"""

import logging
import structlog
from .settings import settings


def setup_logging() -> None:
    """Configure structured logging for the application."""
    
    # Set the log level based on settings
    log_level = getattr(logging, settings.log_level.upper(), logging.INFO)
    
    # Configure structlog
    structlog.configure(
        processors=[
            structlog.contextvars.merge_contextvars,
            structlog.processors.add_log_level,
            structlog.processors.StackInfoRenderer(),
            structlog.dev.set_exc_info,
            structlog.processors.TimeStamper(fmt="iso"),
            # Use console renderer in development, JSON in production
            structlog.dev.ConsoleRenderer()
            if settings.environment == "development"
            else structlog.processors.JSONRenderer(),
        ],
        wrapper_class=structlog.make_filtering_bound_logger(log_level),
        context_class=dict,
        logger_factory=structlog.PrintLoggerFactory(),
        cache_logger_on_first_use=True,
    )
    
    # Also configure standard logging for third-party libraries
    logging.basicConfig(
        format="%(message)s",
        level=log_level,
    )


def get_logger(name: str) -> structlog.BoundLogger:
    """Get a logger instance with the given name."""
    return structlog.get_logger(name)
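
`setup_logging()` turns the `LOG_LEVEL` string into a numeric level with `getattr(logging, ..., logging.INFO)`. The sketch below isolates that lookup and adds one extra guard as an assumption of this example: a few uppercase attributes of the `logging` module, such as `BASIC_FORMAT`, are not levels at all. `resolve_log_level` is a hypothetical name.

```python
import logging


def resolve_log_level(name: str) -> int:
    """Map a level name to its numeric value, defaulting to INFO."""
    level = getattr(logging, name.upper(), logging.INFO)
    # Reject attributes that exist on the module but are not numeric levels
    return level if isinstance(level, int) else logging.INFO


print(resolve_log_level("debug"))         # 10
print(resolve_log_level("not-a-level"))   # 20
print(resolve_log_level("basic_format"))  # 20
```

A misspelled `LOG_LEVEL` therefore degrades to `INFO` rather than raising at startup.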


### scripts/verify_setup.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.1
# File: scripts/verify_setup.py

"""
Setup Verification Script

Run this to ensure your CASPAR development environment is properly configured.
"""

import sys
from pathlib import Path


def check_python_version():
    """Verify Python version is 3.11+"""
    version = sys.version_info
    if version < (3, 11):
        print(f"❌ Python 3.11+ required, found {version.major}.{version.minor}")
        return False
    print(f"✅ Python {version.major}.{version.minor}.{version.micro}")
    return True


def check_dependencies():
    """Verify all required packages are installed."""
    required = [
        ("langchain", "langchain"),
        ("langchain_openai", "langchain-openai"),
        ("langchain_text_splitters", "langchain-text-splitters"),
        ("langgraph", "langgraph"),
        ("chromadb", "chromadb"),
        ("fastapi", "fastapi"),
        ("pydantic_settings", "pydantic-settings"),
        ("structlog", "structlog"),
        ("psycopg", "psycopg"),
    ]
    
    all_good = True
    for module_name, package_name in required:
        try:
            __import__(module_name)
            print(f"✅ {package_name}")
        except ImportError:
            print(f"❌ {package_name} - run: pip install {package_name}")
            all_good = False
    
    return all_good


def check_caspar_installed():
    """Verify the caspar package is installed in editable mode."""
    try:
        import caspar
        print("✅ caspar package is installed")
        return True
    except ImportError:
        print("❌ caspar package not found")
        print("   Run: pip install -e .")
        print("   (Make sure you're in the project root where pyproject.toml is)")
        return False


def check_env_file():
    """Verify .env file exists and has required variables."""
    # Find .env relative to this script's location
    env_path = Path(__file__).parent.parent / ".env"
    
    if not env_path.exists():
        print("❌ .env file not found")
        print("   Create one with: OPENAI_API_KEY=sk-your-key-here")
        return False
    
    content = env_path.read_text()
    
    if "OPENAI_API_KEY" not in content:
        print("❌ OPENAI_API_KEY not found in .env")
        return False
    
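    # Include the placeholder used in .env.example as well as common "sk-" stubs
    placeholders = ("your-openai-api-key-here", "sk-your", "sk-xxx")
    if any(placeholder in content for placeholder in placeholders):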
        print("⚠️  .env found but OPENAI_API_KEY appears to be a placeholder")
        print("   Replace it with your actual API key")
        return False
    
    print("✅ .env file configured")
    return True


def check_configuration():
    """Verify configuration loads correctly."""
    try:
        from caspar.config import settings
        
        # Check that we can access settings
        _ = settings.openai_api_key
        _ = settings.default_model
        
        print("✅ Configuration loaded")
        print(f"   Environment: {settings.environment}")
        print(f"   Default model: {settings.default_model}")
        return True
        
    except Exception as e:
        print(f"❌ Configuration error: {e}")
        return False


def check_openai_connection():
    """Verify OpenAI API connection works."""
    try:
        from caspar.config import settings
        from langchain_openai import ChatOpenAI
        
        llm = ChatOpenAI(
            model=settings.default_model,
            api_key=settings.openai_api_key,
            max_tokens=10
        )
        
        # Make a minimal test call; any exception means the key, model, or network is wrong
        llm.invoke("Say 'OK' and nothing else.")
        
        print("✅ OpenAI API connection successful")
        print(f"   Model: {settings.default_model}")
        return True
        
    except Exception as e:
        print(f"❌ OpenAI API error: {e}")
        return False


def check_database_connection():
    """Verify PostgreSQL database connection works."""
    try:
        import psycopg
        from caspar.config import settings
        
        with psycopg.connect(settings.database_url) as conn:
            with conn.cursor() as cur:
                cur.execute("SELECT version();")
                version = cur.fetchone()[0]
                print("✅ PostgreSQL connection successful")
                print(f"   {version[:50]}...")
                return True
                
    except ImportError:
        print('❌ psycopg not installed - run: pip install "psycopg[binary]"')
        return False
    except Exception as e:
        print(f"❌ Database connection failed: {e}")
        print("   Make sure PostgreSQL is running: docker compose up -d")
        return False


def check_directory_structure():
    """Verify project directory structure is correct."""
    base_path = Path(__file__).parent.parent
    
    required_dirs = [
        "src/caspar/agent",
        "src/caspar/api",
        "src/caspar/knowledge",
        "src/caspar/tools",
        "src/caspar/handoff",
        "src/caspar/config",
        "tests/unit",
        "tests/integration",
        "tests/evaluation",
        "data/knowledge_base",
        "data/sample_data",
    ]
    
    all_good = True
    for dir_path in required_dirs:
        full_path = base_path / dir_path
        if full_path.exists():
            print(f"✅ {dir_path}/")
        else:
            print(f"❌ {dir_path}/ - missing")
            all_good = False
    
    return all_good


def main():
    """Run all verification checks."""
    print("=" * 60)
    print("🔍 CASPAR Setup Verification")
    print("=" * 60)
    
    checks = [
        ("Python Version", check_python_version),
        ("Dependencies", check_dependencies),
        ("CASPAR Package", check_caspar_installed),
        ("Directory Structure", check_directory_structure),
        ("Environment File", check_env_file),
        ("Configuration", check_configuration),
        ("Database Connection", check_database_connection),
        ("OpenAI Connection", check_openai_connection),
    ]
    
    results = []
    for name, check_func in checks:
        print(f"\n📋 Checking {name}...")
        print("-" * 40)
        results.append(check_func())
    
    print("\n" + "=" * 60)
    
    if all(results):
        print("🎉 All checks passed! You're ready to build CASPAR!")
    else:
        print("⚠️  Some checks failed. Please fix the issues above.")
        print("   Refer to the setup instructions in Section 20.1")
    
    print("=" * 60)
    
    return all(results)


if __name__ == "__main__":
    success = main()
    sys.exit(0 if success else 1)
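
`check_env_file()` searches the whole `.env` text for substrings, so a commented-out key or a placeholder from `.env.example` can slip through. A slightly stricter variant, sketched here under the assumption that `.env` contains simple `KEY=value` lines, reads the assigned value first; `read_env_value`, `api_key_status`, and `PLACEHOLDER_MARKERS` are hypothetical names.

```python
PLACEHOLDER_MARKERS = ("your-openai-api-key-here", "sk-your", "sk-xxx")


def read_env_value(content: str, key: str):
    """Return the value assigned to key in .env-style text, or None."""
    for raw_line in content.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines that are not assignments
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            return value.strip().strip("'\"")
    return None


def api_key_status(content: str) -> str:
    """Classify the OPENAI_API_KEY entry as 'missing', 'placeholder', or 'set'."""
    value = read_env_value(content, "OPENAI_API_KEY")
    if not value:
        return "missing"
    if any(marker in value for marker in PLACEHOLDER_MARKERS):
        return "placeholder"
    return "set"


print(api_key_status("OPENAI_API_KEY=your-openai-api-key-here"))  # placeholder
print(api_key_status("# OPENAI_API_KEY=abc123"))                  # missing
print(api_key_status("OPENAI_API_KEY=abc123"))                    # set
```

This only checks the shape of the file; whether the key actually works is still decided by the live call in `check_openai_connection()`.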


---
## Section 20.2: Designing the agent architecture

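This section defines the LangGraph agent architecture: a shared state schema, node functions that read and update that state, a graph that routes between the nodes, and PostgreSQL-backed persistence.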

**Key files:**
- `src/caspar/agent/state.py` - Agent state schema
- `src/caspar/agent/nodes.py` - Graph nodes (intent classification, handlers)
- `src/caspar/agent/graph.py` - Graph construction and routing
- `src/caspar/agent/persistence.py` - PostgreSQL persistence
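
Routing in this kind of architecture comes down to a function that inspects the state and returns the name of the next node. The sketch below is an assumption about how that could look, not the routing in `graph.py`: the node names simply mirror the handler function names, and `route_by_intent` is a hypothetical helper.

```python
INTENT_TO_NODE = {
    "faq": "handle_faq",
    "order_inquiry": "handle_order_inquiry",
    "account": "handle_account",
    "complaint": "handle_complaint",
    "handoff_request": "human_handoff",
    "general": "handle_general",
}


def route_by_intent(state: dict) -> str:
    """Pick the next node name from the classified intent."""
    # Escalation takes priority over whatever intent was detected
    if state.get("needs_escalation"):
        return "human_handoff"
    # Missing or unknown intents fall back to the general handler
    intent = state.get("intent") or "general"
    return INTENT_TO_NODE.get(intent, "handle_general")


print(route_by_intent({"intent": "order_inquiry"}))
print(route_by_intent({"intent": "faq", "needs_escalation": True}))
```

In LangGraph, a function of this shape is typically registered with `StateGraph.add_conditional_edges`, which calls it after the source node finishes and follows the returned name.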

### agent/state.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.2
# File: src/caspar/agent/state.py

"""
CASPAR Agent State Definition

This module defines the state schema that flows through the LangGraph agent.
Every node reads from and writes to this shared state.
"""

from typing import Annotated, Literal
from typing_extensions import TypedDict
from langgraph.graph.message import add_messages
from pydantic import BaseModel, Field
from datetime import datetime, timezone


# Message handling - LangGraph's add_messages reducer handles conversation history
class AgentState(TypedDict):
    """
    The state that flows through the CASPAR agent graph.
    
    AgentState is a TypedDict, one of the schema types LangGraph accepts for graph state.
    Each field represents a piece of information that nodes can read or update.
    """
    
    # Conversation messages - uses add_messages reducer to append new messages
    messages: Annotated[list, add_messages]
    
    # Customer identification
    customer_id: str | None
    conversation_id: str
    
    # Intent classification results
    intent: str | None  # faq, order_inquiry, account, complaint, general, handoff_request
    confidence: float | None
    
    # Sentiment tracking
    sentiment_score: float | None  # -1.0 (very negative) to 1.0 (very positive)
    frustration_level: Literal["low", "medium", "high"] | None
    
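    # Context from tools and knowledge base
    retrieved_context: str | None  # RAG results
    order_info: dict | None  # From order lookup tool
    ticket_id: str | None  # If a support ticket was created
    
    # Handler output consumed by the respond node (keys returned in nodes.py)
    context: str | None  # Text gathered by the active handler
    handler_used: str | None  # Name of the handler that produced the context
    order_id: str | None  # Normalized order ID, if one was extracted
    pending_response: str | None  # Draft reply, kept for the approval workflow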
    
    # Routing and flow control
    needs_escalation: bool
    escalation_reason: str | None
    
    # Metadata
    turn_count: int
    created_at: str
    last_updated: str


class ConversationMetadata(BaseModel):
    """Metadata about a conversation for logging and analytics."""
    
    conversation_id: str
    customer_id: str | None = None
    started_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    ended_at: datetime | None = None
    total_turns: int = 0
    intents_detected: list[str] = Field(default_factory=list)
    escalated: bool = False
    escalation_reason: str | None = None
    sentiment_trajectory: list[float] = Field(default_factory=list)
    resolution_status: Literal["resolved", "escalated", "abandoned", "ongoing"] = "ongoing"


def create_initial_state(
    conversation_id: str,
    customer_id: str | None = None
) -> AgentState:
    """
    Create a fresh state for a new conversation.
    
    Args:
        conversation_id: Unique identifier for this conversation
        customer_id: Optional customer identifier if known
        
    Returns:
        Initial AgentState with default values
    """
    now = datetime.now(timezone.utc).isoformat()
    
    return AgentState(
        messages=[],
        customer_id=customer_id,
        conversation_id=conversation_id,
        intent=None,
        confidence=None,
        sentiment_score=None,
        frustration_level=None,
        retrieved_context=None,
        order_info=None,
        ticket_id=None,
        needs_escalation=False,
        escalation_reason=None,
        turn_count=0,
        created_at=now,
        last_updated=now,
    )
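
The difference between the `messages` field and every other field is how updates are merged: a field with a reducer accumulates, while a plain field is overwritten. The sketch below imitates that behavior in plain Python so it runs without LangGraph. It is a simplified stand-in, not LangGraph's implementation; the real `add_messages` does more, for example replacing messages that share an ID. `append_reducer`, `REDUCERS`, and `apply_update` are hypothetical names.

```python
from typing import Callable, Dict


def append_reducer(existing: list, new: list) -> list:
    """Simplified message reducer: append new items, never overwrite."""
    return existing + new


# Fields listed here are merged with a reducer; all others are overwritten
REDUCERS: Dict[str, Callable] = {"messages": append_reducer}


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into a copy of the state."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        merged[key] = reducer(merged.get(key, []), value) if reducer else value
    return merged


state = {"messages": ["Where is my order?"], "intent": None, "turn_count": 0}

# A classification node returns only the keys it changed
state = apply_update(state, {"intent": "order_inquiry"})

# A response node adds one message and bumps the counter
state = apply_update(
    state, {"messages": ["Let me check that for you."], "turn_count": 1}
)
print(state)
```

After both updates the state holds two messages, the classified intent, and the new turn count, which is why nodes can return small dictionaries instead of rebuilding the whole state.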


### agent/nodes.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.3
# File: src/caspar/agent/nodes.py

"""
CASPAR Agent Nodes - Processing functions for each step in the graph.

Each node function:
1. Takes the current state as input
2. Performs some processing (often using an LLM)
3. Returns updates to merge into the state
"""

from datetime import datetime, timezone

from langchain_openai import ChatOpenAI
from langchain_core.messages import AIMessage, HumanMessage

from caspar.config import settings, get_logger
from caspar.knowledge import get_retriever
from caspar.tools import (
    get_order_status,
    get_account_info,
    create_ticket,
)

logger = get_logger(__name__)


# ════════════════════════════════════════════════════════════════════════════
# Intent Classification Node
# ════════════════════════════════════════════════════════════════════════════

async def classify_intent(state: dict) -> dict:
    """
    Classify the customer's intent from their message.
    
    This is the entry point - determines which handler to route to.
    """
    logger.info("classify_intent_start", conversation_id=state.get("conversation_id"))
    
    messages = state["messages"]
    if not messages:
        return {"intent": "general"}
    
    # Get the last customer message
    last_message = messages[-1].content if messages else ""
    
    llm = ChatOpenAI(
        model=settings.default_model,
        api_key=settings.openai_api_key,
        temperature=0  # Deterministic for classification
    )
    
    classification_prompt = f"""Classify the customer's intent into ONE of these categories:

- faq: General questions about policies, products, services, shipping times, return policies, how things work
- order_inquiry: Questions about a SPECIFIC order (mentions order number, tracking number, "my order", "my package")
- account: Account-related issues (login, profile, password, settings, "my account")
- complaint: Expressing dissatisfaction, problems, wanting refunds, frustrated language
- handoff_request: Explicitly asking for a human agent, representative, or real person
- general: Anything else or unclear

IMPORTANT: 
- "How long does shipping take?" = faq (general policy question)
- "Where is my order?" or "Track order #123" = order_inquiry (specific order)

Customer message: "{last_message}"

Respond with just the category name, nothing else."""

    # Use the async API so the event loop is not blocked inside this async node
    response = await llm.ainvoke([HumanMessage(content=classification_prompt)])
    intent = response.content.strip().lower()
    
    # Validate intent
    valid_intents = ["faq", "order_inquiry", "account", "complaint", "handoff_request", "general"]
    if intent not in valid_intents:
        intent = "general"
    
    logger.info("classify_intent_complete", intent=intent)
    
    return {
        "intent": intent,
        "last_updated": datetime.now(timezone.utc).isoformat()
    }


# ════════════════════════════════════════════════════════════════════════════
# Intent Handler Nodes
# ════════════════════════════════════════════════════════════════════════════

async def handle_faq(state: dict) -> dict:
    """
    Handle FAQ-type questions using the knowledge base.
    """
    logger.info("handle_faq_start", conversation_id=state.get("conversation_id"))
    
    messages = state["messages"]
    last_message = messages[-1].content if messages else ""
    
    # Retrieve relevant knowledge
    retriever = get_retriever()
    docs = retriever.retrieve(last_message)
    
    context = "\n\n".join([doc.page_content for doc in docs]) if docs else ""
    
    return {
        "context": context,
        "handler_used": "faq",
        "last_updated": datetime.now(timezone.utc).isoformat()
    }


async def handle_order_inquiry(state: dict) -> dict:
    """
    Handle order-related inquiries by looking up order information.
    """
    logger.info("handle_order_inquiry_start", conversation_id=state.get("conversation_id"))
    
    messages = state["messages"]
    last_message = messages[-1].content if messages else ""
    
    # Try to extract order ID from message
    llm = ChatOpenAI(
        model=settings.default_model,
        api_key=settings.openai_api_key,
        temperature=0
    )
    
    extract_prompt = f"""Extract the order ID from this message if present.
Order IDs look like: TF-XXXXX (e.g., TF-10001) or just the number (e.g., 10001)

Message: "{last_message}"

Respond with just the order ID (e.g., TF-10001 or 10001), or "NONE" if not found."""

    response = await llm.ainvoke([HumanMessage(content=extract_prompt)])
    # Uppercase so "none" and "tf-10001" are treated like "NONE" and "TF-10001"
    order_id = response.content.strip().upper()
    
    context = ""
    order_info = None
    
    if order_id != "NONE":
        # Normalize order ID - add TF- prefix if needed
        if not order_id.startswith("TF-"):
            # Remove any non-numeric prefix and add TF-
            numeric_part = ''.join(filter(str.isdigit, order_id))
            if numeric_part:
                order_id = f"TF-{numeric_part}"
        
        # Look up order
        order_result = get_order_status(order_id)
        if order_result["found"]:
            order = order_result["order"]
            # Extract item names from item dicts
            item_names = [item["name"] if isinstance(item, dict) else str(item) for item in order["items"]]
            order_info = {
                "order_id": order["order_id"],
                "status": order["status"],
                "items": order["items"],
                # "or" also covers keys that are present but set to None
                "shipping_address": order.get("shipping_address") or "N/A",
                "tracking_number": order.get("tracking_number") or "Not yet available",
                "estimated_delivery": order.get("estimated_delivery") or "TBD",
            }
            context = f"""Order Information:
- Order ID: {order['order_id']}
- Status: {order['status']}
- Items: {', '.join(item_names)}
- Shipping: {order.get('shipping_method', 'N/A')}
- Tracking: {order.get('tracking_number') or 'Not yet available'}
- Estimated Delivery: {order.get('estimated_delivery') or 'TBD'}"""
        else:
            order_info = {"error": "Order not found"}
            context = f"Order {order_id} not found in system."
    else:
        context = "No order ID provided. Ask customer for their order number."
    
    return {
        "context": context,
        "handler_used": "order_inquiry",
        "order_id": order_id if order_id != "NONE" else None,
        "order_info": order_info,
        "last_updated": datetime.now(timezone.utc).isoformat()
    }


async def handle_account(state: dict) -> dict:
    """
    Handle account-related inquiries.
    """
    logger.info("handle_account_start", conversation_id=state.get("conversation_id"))
    
    customer_id = state.get("customer_id")
    context = ""
    
    if customer_id:
        account_result = get_account_info(customer_id)
        if account_result["found"]:
            account = account_result["account"]
            context = f"""Customer Account Information:
- Name: {account['name']}
- Email: {account['email']}
- Status: {account['status']}
- Loyalty Tier: {account.get('loyalty_tier', 'Standard')}
- Member Since: {account.get('member_since', 'N/A')}"""
        else:
            context = "Customer account not found."
    else:
        context = "No customer ID available. Ask customer to verify their identity."
    
    return {
        "context": context,
        "handler_used": "account",
        "last_updated": datetime.now(timezone.utc).isoformat()
    }


async def handle_complaint(state: dict) -> dict:
    """
    Handle customer complaints with empathy and create a ticket.
    """
    logger.info("handle_complaint_start", conversation_id=state.get("conversation_id"))
    
    messages = state["messages"]
    last_message = messages[-1].content if messages else ""
    # The key exists with value None for anonymous customers, so a .get() default is not enough
    customer_id = state.get("customer_id") or "UNKNOWN"
    
    # Create a support ticket for the complaint
    ticket_result = create_ticket(
        customer_id=customer_id,
        category="complaint",
        subject="Customer Complaint",
        description=last_message,
        priority="high",
        conversation_id=state.get("conversation_id")
    )
    
    ticket_id = ticket_result["ticket"]["ticket_id"]
    
    context = f"""Complaint ticket created:
- Ticket ID: {ticket_id}
- Priority: High
- Status: Open

Acknowledge the customer's frustration with empathy. Reference the ticket number.
Assure them their concern is being taken seriously."""
    
    return {
        "context": context,
        "handler_used": "complaint",
        "ticket_id": ticket_id,
        "last_updated": datetime.now(timezone.utc).isoformat()
    }


async def handle_general(state: dict) -> dict:
    """
    Handle general inquiries that don't fit other categories.
    """
    logger.info("handle_general_start", conversation_id=state.get("conversation_id"))
    
    # Use knowledge base for general context
    messages = state["messages"]
    last_message = messages[-1].content if messages else ""
    
    retriever = get_retriever()
    docs = retriever.retrieve(last_message)
    
    context = "\n\n".join([doc.page_content for doc in docs]) if docs else ""
    
    return {
        "context": context,
        "handler_used": "general",
        "last_updated": datetime.now(timezone.utc).isoformat()
    }


# ════════════════════════════════════════════════════════════════════════════
# Response Generation Node
# ════════════════════════════════════════════════════════════════════════════

async def respond(state: dict) -> dict:
    """
    Generate the final response to the customer.
    
    Uses all gathered context to craft a helpful, empathetic response.
    """
    logger.info("respond_start", conversation_id=state.get("conversation_id"))
    
    messages = state["messages"]
    # Fall back with "or" because these keys may be present but None
    context = state.get("context") or ""
    intent = state.get("intent") or "general"
    handler_used = state.get("handler_used") or "general"
    
    # Build conversation history for context
    conversation_history = "\n".join([
        f"{'Customer' if isinstance(m, HumanMessage) else 'Agent'}: {m.content}"
        for m in messages[-5:]  # Last 5 messages for context
    ])
    
    llm = ChatOpenAI(
        model=settings.default_model,
        api_key=settings.openai_api_key,
        temperature=0.7  # Slightly creative for natural responses
    )
    
    system_prompt = """You are CASPAR, a friendly and helpful customer service assistant for TechFlow Solutions.

Your personality:
- Warm, professional, and empathetic
- Clear and concise in explanations
- Always helpful and solution-oriented
- Acknowledge customer feelings when appropriate

Guidelines:
- If you have specific information from the context, use it
- If you don't have enough information, ask clarifying questions
- Never make up information about orders, accounts, or policies
- For complaints, acknowledge feelings first, then offer solutions
- Keep responses conversational, not robotic"""

    user_prompt = f"""Intent: {intent}
Handler: {handler_used}

Context/Information gathered:
{context if context else "No specific context available."}

Recent conversation:
{conversation_history}

Generate a helpful response to the customer's last message. Be natural and conversational."""

    # Await the async variant so the event loop is not blocked during the LLM call
    response = await llm.ainvoke([
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ])
    
    ai_response = response.content
    
    logger.info("respond_complete", response_length=len(ai_response))
    
    return {
        "messages": [AIMessage(content=ai_response)],
        "pending_response": ai_response,  # For approval workflow
        "last_updated": datetime.now(timezone.utc).isoformat()
    }
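
The history window and the empty-context fallback used by `respond` can be tried in isolation, without an API key. The sketch below is a stand-in rather than the project code: `Msg` is a hypothetical dataclass replacing LangChain's `HumanMessage`/`AIMessage`, and `build_history` and `build_user_prompt` are illustrative helper names that do not exist in CASPAR.

```python
from dataclasses import dataclass


@dataclass
class Msg:
    """Hypothetical stand-in for a LangChain message (role + content)."""
    role: str      # "human" or "ai"
    content: str


def build_history(messages: list[Msg], window: int = 5) -> str:
    """Format the last `window` messages as 'Customer:' / 'Agent:' lines."""
    return "\n".join(
        f"{'Customer' if m.role == 'human' else 'Agent'}: {m.content}"
        for m in messages[-window:]
    )


def build_user_prompt(intent: str, handler: str, context: str, messages: list[Msg]) -> str:
    """Lay out the user prompt in the same order respond() uses."""
    return (
        f"Intent: {intent}\n"
        f"Handler: {handler}\n\n"
        "Context/Information gathered:\n"
        f"{context if context else 'No specific context available.'}\n\n"
        "Recent conversation:\n"
        f"{build_history(messages)}"
    )


# Eight alternating messages; only the last five should reach the prompt
messages = [Msg("human" if i % 2 == 0 else "ai", f"message {i}") for i in range(8)]
history = build_history(messages)
prompt = build_user_prompt("faq", "faq", "", messages)
print(prompt)
```

Keeping the window small bounds prompt size, at the cost of dropping anything the customer said more than five messages ago.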


### agent/graph.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.3
# File: src/caspar/agent/graph.py

"""
CASPAR Agent Graph - The workflow that connects all components.

This module defines the StateGraph that orchestrates the agent's behavior,
routing messages through classification, handling, and response generation.
"""

from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver

from caspar.config import get_logger
from .state import AgentState
from .nodes import (
    classify_intent,
    handle_faq,
    handle_order_inquiry,
    handle_account,
    handle_complaint,
    handle_general,
    respond,
)
from .nodes_handoff_update import check_sentiment, human_handoff

logger = get_logger(__name__)


# ════════════════════════════════════════════════════════════════════════════
# Routing Functions
# ════════════════════════════════════════════════════════════════════════════

def route_by_intent(state: AgentState) -> str:
    """Route to the appropriate handler based on classified intent."""
    intent = state.get("intent", "general")
    
    routes = {
        "faq": "handle_faq",
        "order_inquiry": "handle_order_inquiry",
        "account": "handle_account",
        "complaint": "handle_complaint",
        "handoff_request": "human_handoff",
        "general": "handle_general",
    }
    
    return routes.get(intent, "handle_general")


def route_after_sentiment(state: AgentState) -> str:
    """Route based on sentiment analysis - escalate if needed."""
    if state.get("needs_escalation") and state.get("intent") != "handoff_request":
        return "human_handoff"
    return "respond"


# ════════════════════════════════════════════════════════════════════════════
# Graph Builder
# ════════════════════════════════════════════════════════════════════════════

def build_graph() -> StateGraph:
    """
    Build the CASPAR agent graph.
    
    The flow is:
    1. classify_intent: Determine what the customer needs
    2. handle_*: Process the specific type of request
    3. check_sentiment: Analyze customer emotion
    4. respond OR human_handoff: Generate response or escalate
    
    Returns:
        StateGraph: The uncompiled graph (call .compile() to use)
    """
    graph = StateGraph(AgentState)
    
    # Add all nodes
    graph.add_node("classify_intent", classify_intent)
    graph.add_node("handle_faq", handle_faq)
    graph.add_node("handle_order_inquiry", handle_order_inquiry)
    graph.add_node("handle_account", handle_account)
    graph.add_node("handle_complaint", handle_complaint)
    graph.add_node("handle_general", handle_general)
    graph.add_node("check_sentiment", check_sentiment)
    graph.add_node("respond", respond)
    graph.add_node("human_handoff", human_handoff)
    
    # Set entry point
    graph.set_entry_point("classify_intent")
    
    # Route by intent after classification
    graph.add_conditional_edges(
        "classify_intent",
        route_by_intent,
        {
            "handle_faq": "handle_faq",
            "handle_order_inquiry": "handle_order_inquiry",
            "handle_account": "handle_account",
            "handle_complaint": "handle_complaint",
            "handle_general": "handle_general",
            "human_handoff": "human_handoff",
        }
    )
    
    # All handlers go to sentiment check
    for handler in ["handle_faq", "handle_order_inquiry", "handle_account", 
                    "handle_complaint", "handle_general"]:
        graph.add_edge(handler, "check_sentiment")
    
    # Sentiment check routes to respond or escalate
    graph.add_conditional_edges(
        "check_sentiment",
        route_after_sentiment,
        {
            "respond": "respond",
            "human_handoff": "human_handoff"
        }
    )
    
    # End nodes
    graph.add_edge("respond", END)
    graph.add_edge("human_handoff", END)
    
    return graph


async def create_agent(checkpointer=None):
    """
    Create a compiled CASPAR agent ready for use.
    
    Args:
        checkpointer: Optional checkpointer for persistence.
                     If None, uses in-memory storage.
    
    Returns:
        Compiled graph ready to process messages.
    """
    graph = build_graph()
    
    if checkpointer is None:
        checkpointer = MemorySaver()
    
    return graph.compile(checkpointer=checkpointer)


# ════════════════════════════════════════════════════════════════════════════
# HITL (Human-in-the-Loop) Extensions
# ════════════════════════════════════════════════════════════════════════════
# These are optional extensions for workflows requiring human approval

from langgraph.types import interrupt, Command
from caspar.handoff.approval import needs_approval, get_approval_reason


async def check_approval_needed(state: AgentState) -> dict:
    """
    Check if the pending response needs human approval.
    
    If approval is needed, interrupts the graph and waits for human decision.
    """
    if not needs_approval(state):
        return {"approval_status": "not_required"}
    
    pending_response = state.get("pending_response", "")
    reason = get_approval_reason(state)
    
    logger.info(
        "approval_required",
        conversation_id=state.get("conversation_id"),
        reason=reason
    )
    
    # Interrupt and wait for human decision
    human_decision = interrupt({
        "type": "approval_required",
        "pending_response": pending_response,
        "reason": reason,
        "conversation_id": state.get("conversation_id"),
        "customer_id": state.get("customer_id"),
    })
    
    if human_decision.get("approved"):
        final_response = human_decision.get("edited_response") or pending_response
        return {
            "approval_status": "approved",
            "pending_response": final_response,
            "reviewed_by": human_decision.get("reviewer_id"),
        }
    else:
        return {
            "approval_status": "rejected",
            "reviewed_by": human_decision.get("reviewer_id"),
            "needs_escalation": True,
        }


def route_after_approval(state: AgentState) -> str:
    """Route after approval check."""
    status = state.get("approval_status", "not_required")
    if status in ["not_required", "approved"]:
        return "send_response"
    return END


async def send_response(state: AgentState) -> dict:
    """Send the final response to the customer."""
    logger.info("send_response", conversation_id=state.get("conversation_id"))
    return {"response_sent": True}


def build_graph_with_approval() -> StateGraph:
    """
    Build agent graph with human approval workflow.
    
    Extends the standard graph to add approval checks for high-stakes responses.
    """
    graph = StateGraph(AgentState)
    
    # Add all standard nodes
    graph.add_node("classify_intent", classify_intent)
    graph.add_node("handle_faq", handle_faq)
    graph.add_node("handle_order_inquiry", handle_order_inquiry)
    graph.add_node("handle_account", handle_account)
    graph.add_node("handle_complaint", handle_complaint)
    graph.add_node("handle_general", handle_general)
    graph.add_node("check_sentiment", check_sentiment)
    graph.add_node("respond", respond)
    graph.add_node("human_handoff", human_handoff)
    
    # Add approval nodes
    graph.add_node("check_approval_needed", check_approval_needed)
    graph.add_node("send_response", send_response)
    
    # Set entry point
    graph.set_entry_point("classify_intent")
    
    # Route by intent
    graph.add_conditional_edges(
        "classify_intent",
        route_by_intent,
        {
            "handle_faq": "handle_faq",
            "handle_order_inquiry": "handle_order_inquiry",
            "handle_account": "handle_account",
            "handle_complaint": "handle_complaint",
            "handle_general": "handle_general",
            "human_handoff": "human_handoff",
        }
    )
    
    # Handlers -> sentiment check
    for handler in ["handle_faq", "handle_order_inquiry", "handle_account", 
                    "handle_complaint", "handle_general"]:
        graph.add_edge(handler, "check_sentiment")
    
    # Sentiment -> respond or escalate
    graph.add_conditional_edges(
        "check_sentiment",
        route_after_sentiment,
        {"respond": "respond", "human_handoff": "human_handoff"}
    )
    
    # Respond -> approval check
    graph.add_edge("respond", "check_approval_needed")
    
    # Approval -> send or end
    graph.add_conditional_edges(
        "check_approval_needed",
        route_after_approval,
        {"send_response": "send_response", END: END}
    )
    
    # End nodes
    graph.add_edge("send_response", END)
    graph.add_edge("human_handoff", END)
    
    return graph


async def create_agent_with_approval(checkpointer=None):
    """
    Create agent with human approval support.
    
    IMPORTANT: A checkpointer is REQUIRED for interrupts to work!
    """
    graph = build_graph_with_approval()
    
    if checkpointer is None:
        raise ValueError(
            "Checkpointer is required for interrupt support. "
            "The graph must persist state to resume after approval."
        )
    
    return graph.compile(checkpointer=checkpointer)
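
Both routing functions are plain functions of the state, so the paths through the graph can be checked without LangGraph or an LLM. The sketch below copies their logic onto plain dicts; `trace_path` is a hypothetical helper that only mimics the edges wired up in `build_graph()` and is not part of the project.

```python
def route_by_intent(state: dict) -> str:
    """Same mapping as graph.py, applied to a plain dict."""
    routes = {
        "faq": "handle_faq",
        "order_inquiry": "handle_order_inquiry",
        "account": "handle_account",
        "complaint": "handle_complaint",
        "handoff_request": "human_handoff",
        "general": "handle_general",
    }
    return routes.get(state.get("intent", "general"), "handle_general")


def route_after_sentiment(state: dict) -> str:
    """Escalate when sentiment analysis flagged the conversation."""
    if state.get("needs_escalation") and state.get("intent") != "handoff_request":
        return "human_handoff"
    return "respond"


def trace_path(state: dict) -> list[str]:
    """Hypothetical helper: list the nodes a message would visit."""
    path = ["classify_intent"]
    first = route_by_intent(state)
    path.append(first)
    if first == "human_handoff":
        return path  # explicit handoff requests skip the sentiment check
    path.append("check_sentiment")
    path.append(route_after_sentiment(state))
    return path


print(trace_path({"intent": "faq"}))
print(trace_path({"intent": "complaint", "needs_escalation": True}))
print(trace_path({"intent": "handoff_request"}))
print(trace_path({"intent": "unknown_label"}))  # falls back to handle_general
```

Tracing paths this way is a cheap check that every intent label the classifier can emit has a route, before any model calls are involved.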


### agent/persistence.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.2
# File: src/caspar/agent/persistence.py

"""
Conversation persistence using PostgreSQL.

This module provides checkpointing functionality that allows
conversations to survive restarts and be resumed later.

IMPORTANT: AsyncPostgresSaver must be used as an async context manager.
The checkpointer should stay open for the lifetime of your application.
"""

import os
from contextlib import asynccontextmanager
from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

from caspar.config import get_logger

logger = get_logger(__name__)


@asynccontextmanager
async def create_checkpointer_context():
    """
    Create a PostgreSQL checkpointer as an async context manager.
    
    This must be used with 'async with' and should wrap your entire
    application lifecycle (e.g., in FastAPI's lifespan).
    
    Yields:
        AsyncPostgresSaver if database is configured, None otherwise
    
    Environment Variables:
        DATABASE_URL: PostgreSQL connection string
                     Format: postgresql://user:pass@host:port/dbname
    
    Usage:
        async with create_checkpointer_context() as checkpointer:
            agent = await create_agent(checkpointer=checkpointer)
            # ... run your application ...
            # checkpointer stays open until you exit the 'async with'
    """
    database_url = os.getenv("DATABASE_URL")
    
    if not database_url:
        logger.warning(
            "no_database_url",
            message="DATABASE_URL not set - conversations won't persist across restarts"
        )
        yield None
        return
    
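    # Tracks whether setup finished, so the handler below can tell setup
    # failures apart from errors raised later by the application itself
    checkpointer_ready = False
    
    try:
        # AsyncPostgresSaver MUST be used as async context manager
        async with AsyncPostgresSaver.from_conn_string(database_url) as checkpointer:
            # Set up the required tables (safe to call multiple times)
            await checkpointer.setup()
            
            logger.info("checkpointer_initialized", database="postgresql")
            checkpointer_ready = True
            
            # Yield the checkpointer - it stays open until we exit
            yield checkpointer
            
            # Cleanup happens automatically when we exit the 'async with'
            logger.info("checkpointer_closing")
        
    except Exception as e:
        if checkpointer_ready:
            # We already yielded once; a generator-based context manager
            # must not yield again, so let the application's error propagate
            raise
        logger.error(
            "checkpointer_failed",
            error=str(e),
            message="Falling back to in-memory state (no persistence)"
        )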
        yield None
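
Because the context manager may yield `None`, callers need to handle both cases. The sketch below shows the calling pattern with the standard library only. `fake_checkpointer_context` and the dict it yields are hypothetical stand-ins for `create_checkpointer_context()` and `AsyncPostgresSaver`, and the connection string is a made-up example, so nothing here touches a real database.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Optional

events: list[str] = []   # records the order of setup and teardown


@asynccontextmanager
async def fake_checkpointer_context(database_url: Optional[str]):
    """Hypothetical stand-in: yields None when no URL is configured."""
    if not database_url:
        yield None
        return
    events.append("open")
    try:
        yield {"dsn": database_url}   # placeholder for a real saver object
    finally:
        events.append("close")        # runs when the caller leaves 'async with'


async def run_app(database_url: Optional[str]) -> str:
    async with fake_checkpointer_context(database_url) as checkpointer:
        # create_agent() in graph.py falls back to MemorySaver when given None
        mode = "in-memory" if checkpointer is None else "postgres"
        events.append(f"serving:{mode}")
        return mode


print(asyncio.run(run_app(None)))
print(asyncio.run(run_app("postgresql://user:pass@localhost:5432/caspar")))
print(events)
```

The resource is closed only after the body of the `async with` finishes, which is why the real helper is meant to wrap the whole application lifecycle.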


---
## Section 20.3: Implementing knowledge retrieval

This section implements RAG (Retrieval-Augmented Generation) for answering customer questions.

**Key files:**
- `data/knowledge_base/*.md` - Knowledge base content (FAQs, policies, products)
- `src/caspar/knowledge/loader.py` - Document loading
- `src/caspar/knowledge/retriever.py` - Vector search and RAG
- `scripts/build_knowledge_base.py` - KB indexing script
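
The retrieve-then-join step at the heart of RAG can be sketched with the standard library. This is a toy under stated assumptions: word-count vectors and cosine similarity stand in for the OpenAI embeddings and ChromaDB search that CASPAR uses, the three chunks are shortened from the knowledge base files in this section, and `embed`, `cosine`, and `retrieve` are illustrative names rather than project functions.

```python
import math
import re
from collections import Counter

# Three short snippets condensed from the knowledge base files in this section
CHUNKS = [
    "How long do refunds take? Refund is processed within 5-7 business days.",
    "How do I reset my password? Click Forgot Password and check your email.",
    "Do you ship internationally? We only ship within the continental United States.",
]


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (NOT a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(CHUNKS, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


docs = retrieve("How long until my refund is processed?", k=1)
context = "\n\n".join(docs)   # same join the handlers use to build `context`
print(context)
```

Word overlap fails on paraphrases that share no vocabulary with the stored text, which is the gap that learned embeddings are meant to close.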

### data/knowledge_base/faq.md

In [None]:
# Content of data/knowledge_base/faq.md
# (Rendered below as reference)

'''
# Frequently Asked Questions

## Orders & Shipping

### How do I track my order?
You can track your order by:
1. Logging into your TechFlow account
2. Going to "Order History"
3. Clicking on your order number
4. Clicking "Track Package"

You'll also receive tracking emails when your order ships.

### When will my order arrive?
Delivery times depend on your shipping method:
- Standard: 5-7 business days
- Express: 2-3 business days
- Overnight: Next business day

Orders placed before 2 PM EST ship the same day.

### Can I change my shipping address after ordering?
You can change your shipping address only if the order hasn't shipped yet. Contact customer service immediately, and we'll do our best to update it. Once shipped, the address cannot be changed.

### Do you ship internationally?
Currently, we only ship within the continental United States. We do not ship to Alaska, Hawaii, or international destinations at this time.

## Returns & Refunds

### How do I return an item?
To return an item:
1. Log into your TechFlow account
2. Go to "Order History"
3. Find the order and click "Return Item"
4. Select the items you want to return and your reason
5. Print the prepaid shipping label
6. Drop off at any UPS location

### How long do refunds take?
Once we receive your return:
- Inspection takes 1-2 business days
- Refund is processed within 5-7 business days
- Depending on your bank, it may take an additional 3-5 days to appear

### Can I return an opened product?
Most opened products can be returned if they're in like-new condition. Exceptions include:
- Downloaded software
- Opened headphones/earbuds (hygiene reasons)
- Personalized or customized items

### What if my item arrived damaged?
Contact us immediately! We'll either:
- Send a replacement at no cost
- Issue a full refund

Keep the damaged packaging for carrier claims.

## Products & Technical

### How do I find the right laptop for me?
Consider these factors:
- **Students/Basic use**: TechFlow Air 13 ($899) - lightweight, great battery
- **Professionals**: TechFlow Pro 15 ($1,299) - powerful, beautiful display
- **Gamers**: TechFlow Gaming X17 ($1,799) - high performance, dedicated GPU

### Are your refurbished products reliable?
Yes! Our refurbished products:
- Are thoroughly tested and certified
- Look and function like new
- Come with a 6-month TechFlow warranty
- Include all original accessories
- Are clearly marked and discounted 15-30%

### How do I check warranty status?
To check your warranty:
1. Log into your TechFlow account
2. Go to "My Products"
3. Click on the product
4. View warranty status and expiration date

Or contact customer service with your order number.

## Account & Payment

### How do I reset my password?
1. Go to techflow.com/login
2. Click "Forgot Password"
3. Enter your email address
4. Check your email for a reset link
5. Create a new password

### What payment methods do you accept?
We accept:
- Credit cards (Visa, Mastercard, Amex, Discover)
- PayPal
- Apple Pay / Google Pay
- TechFlow Gift Cards
- Affirm financing (orders over $150)

### Is my payment information secure?
Absolutely. We use:
- 256-bit SSL encryption
- PCI DSS compliance
- No storage of full credit card numbers
- Fraud detection systems

### How does Affirm financing work?
Affirm lets you split purchases over $150 into monthly payments:
1. Select Affirm at checkout
2. Enter basic information
3. Get approved in seconds
4. Choose 3, 6, or 12 month terms
5. Pay over time with no hidden fees

'''

### data/knowledge_base/policies.md

In [None]:
# Content of data/knowledge_base/policies.md
# (Rendered below as reference)

'''
# TechFlow Company Policies

## Return Policy

TechFlow offers a 30-day return policy for most items. Here are the details:

- **Timeframe**: Returns must be initiated within 30 days of delivery
- **Condition**: Items must be in original packaging, unused, with all accessories
- **Exceptions**: Software, opened headphones, and personalized items cannot be returned
- **Process**: Start a return from your account page or contact customer service
- **Refund timing**: Refunds are processed within 5-7 business days after we receive the item

For defective items, we offer exchanges or full refunds regardless of the return window.

## Shipping Policy

We offer three shipping options for continental US orders:

- **Standard Shipping**: 5-7 business days, FREE on orders over $50, otherwise $5.99
- **Express Shipping**: 2-3 business days, $12.99
- **Overnight Shipping**: Next business day, $24.99

Orders placed before 2 PM EST ship the same day. We do not currently ship to Alaska, Hawaii, or international destinations.

## Warranty Policy

All TechFlow products come with manufacturer warranties:

- **Laptops and Computers**: 1-year manufacturer warranty
- **Phones and Tablets**: 1-year manufacturer warranty  
- **Accessories**: 90-day warranty
- **Refurbished Items**: 6-month TechFlow warranty

Extended warranty plans are available for purchase within 30 days of your original order.

## Price Match Policy

We match prices from major retailers including Amazon, Best Buy, and Walmart:

- Item must be identical (same model, color, condition)
- Competitor must have item in stock
- We don't match marketplace sellers, clearance, or doorbusters
- Request must be made within 14 days of purchase

## Payment Methods

We accept:
- All major credit cards (Visa, Mastercard, American Express, Discover)
- PayPal
- Apple Pay and Google Pay
- TechFlow Gift Cards
- Affirm financing (for orders over $150)

'''

### data/knowledge_base/products.md

In [None]:
# Content of data/knowledge_base/products.md
# (Rendered below as reference)

'''
# TechFlow Product Catalog

## Laptops

### TechFlow Pro 15
- **Price**: $1,299
- **Display**: 15.6" 4K OLED
- **Processor**: Intel Core i7-13700H
- **RAM**: 16GB DDR5
- **Storage**: 512GB NVMe SSD
- **Battery**: Up to 10 hours
- **Weight**: 4.2 lbs
- **Best for**: Professional work, content creation, programming

### TechFlow Air 13
- **Price**: $899
- **Display**: 13.3" FHD IPS
- **Processor**: Intel Core i5-1335U
- **RAM**: 8GB DDR4
- **Storage**: 256GB NVMe SSD
- **Battery**: Up to 12 hours
- **Weight**: 2.8 lbs
- **Best for**: Students, everyday use, travel

### TechFlow Gaming X17
- **Price**: $1,799
- **Display**: 17.3" QHD 165Hz
- **Processor**: Intel Core i9-13900HX
- **RAM**: 32GB DDR5
- **Graphics**: NVIDIA RTX 4070
- **Storage**: 1TB NVMe SSD
- **Best for**: Gaming, 3D rendering, video editing

## Phones

### TechFlow Phone 12
- **Price**: $799
- **Display**: 6.5" AMOLED 120Hz
- **Processor**: Snapdragon 8 Gen 2
- **RAM**: 8GB
- **Storage**: 128GB / 256GB options
- **Camera**: 50MP main + 12MP ultrawide + 10MP telephoto
- **Battery**: 4,500mAh with 65W fast charging
- **Colors**: Midnight Black, Ocean Blue, Forest Green

### TechFlow Phone 12 Pro
- **Price**: $1,099
- **Display**: 6.7" AMOLED 120Hz
- **Processor**: Snapdragon 8 Gen 2
- **RAM**: 12GB
- **Storage**: 256GB / 512GB options
- **Camera**: 108MP main + 12MP ultrawide + 10MP telephoto + 2MP macro
- **Battery**: 5,000mAh with 100W fast charging
- **Colors**: Titanium Gray, Pearl White, Rose Gold

## Tablets

### TechFlow Tab 10
- **Price**: $449
- **Display**: 10.5" LCD
- **Processor**: Snapdragon 870
- **RAM**: 6GB
- **Storage**: 128GB (expandable via microSD)
- **Battery**: Up to 14 hours
- **Best for**: Entertainment, reading, light productivity

### TechFlow Tab Pro 12
- **Price**: $799
- **Display**: 12.4" AMOLED
- **Processor**: Snapdragon 8 Gen 1
- **RAM**: 8GB
- **Storage**: 256GB
- **Stylus**: TechFlow Pen included
- **Best for**: Artists, professionals, note-taking

## Accessories

### TechFlow Wireless Earbuds
- **Price**: $129
- **Battery**: 6 hours (24 hours with case)
- **Features**: Active noise cancellation, transparency mode, water resistant

### TechFlow USB-C Hub
- **Price**: $79
- **Ports**: 2x USB-A, 1x USB-C, HDMI, SD card reader, Ethernet

### TechFlow Laptop Stand
- **Price**: $49
- **Material**: Aluminum
- **Features**: Adjustable height, foldable for travel

'''

### data/knowledge_base/troubleshooting.md

In [None]:
# Content of data/knowledge_base/troubleshooting.md
# (Rendered below as reference)

'''
# Troubleshooting Guide

## Laptop Issues

### My laptop won't turn on
Try these steps in order:
1. Make sure it's plugged in and the charging light is on
2. Hold the power button for 15 seconds
3. Disconnect all peripherals and try again
4. Try a different power outlet
5. If still not working, contact support for warranty service

### My laptop is running slowly
Common fixes:
1. Restart your laptop (fixes most issues!)
2. Check for Windows/macOS updates
3. Close unused browser tabs and programs
4. Run a disk cleanup to free up space
5. Check Task Manager for resource-heavy programs
6. Consider adding more RAM if consistently slow

### My laptop battery drains quickly
To improve battery life:
1. Lower screen brightness
2. Turn off Bluetooth and WiFi when not needed
3. Close background applications
4. Use "Battery Saver" mode
5. Check for battery health in settings
6. Normal capacity is 80%+ after 2 years - below that may need replacement

## Phone Issues

### My phone won't charge
Try these solutions:
1. Try a different cable and adapter
2. Clean the charging port gently with a toothpick
3. Check for debris or lint in the port
4. Try wireless charging if available
5. Restart the phone
6. If still not charging, contact support

### My phone screen is frozen
To unfreeze:
1. Try a force restart: hold Power + Volume Down for 10 seconds
2. Wait for the phone to restart
3. If it happens frequently, check for app updates
4. Consider a factory reset as last resort (backup first!)

### My phone's battery drains quickly
To extend battery life:
1. Check battery usage in Settings to find power-hungry apps
2. Reduce screen brightness
3. Turn off location services for apps that don't need it
4. Disable background app refresh for non-essential apps
5. Turn on battery saver mode

## Audio/Earbuds Issues

### My earbuds won't connect
Reset and reconnect:
1. Put earbuds in the case and close it
2. Wait 30 seconds
3. Open Settings > Bluetooth on your device
4. "Forget" the TechFlow Earbuds
5. Open the case, hold the button until light flashes
6. Select TechFlow Earbuds in Bluetooth settings

### One earbud is quieter than the other
Try these fixes:
1. Clean both earbuds with a dry cloth
2. Check ear tips for wax buildup
3. Reset the earbuds (see above)
4. Check audio balance in phone settings
5. If persists, may be a hardware issue - contact support

## General

### I forgot my account password
1. Go to techflow.com/login
2. Click "Forgot Password"
3. Enter your email
4. Check your inbox (and spam folder)
5. Click the reset link
6. Create a new password

### How do I contact support?
You can reach us through:
- This chat (available 24/7)
- Email: support@techflow.com (response within 24 hours)
- Phone: 1-800-TECHFLOW (Mon-Fri, 9 AM - 6 PM EST)
- Twitter/X: @TechFlowSupport

'''

### knowledge/loader.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.3
# File: src/caspar/knowledge/loader.py

"""
Knowledge Base Loader

Loads and processes knowledge base documents from markdown files,
splits them into chunks, and prepares them for embedding.
"""

from pathlib import Path
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter

from caspar.config import get_logger

logger = get_logger(__name__)


class KnowledgeLoader:
    """
    Loads knowledge base content from markdown files.
    
    The loader reads all .md files from the knowledge base directory,
    splits them into manageable chunks, and prepares them for embedding.
    """
    
    def __init__(
        self,
        knowledge_dir: str = "data/knowledge_base",
        chunk_size: int = 500,
        chunk_overlap: int = 50
    ):
        """
        Initialize the knowledge loader.
        
        Args:
            knowledge_dir: Path to directory containing .md files
            chunk_size: Maximum size of each text chunk
            chunk_overlap: Overlap between chunks to preserve context
        """
        self.knowledge_dir = Path(knowledge_dir)
        self.chunk_size = chunk_size
        self.chunk_overlap = chunk_overlap
        
        self.text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=chunk_size,
            chunk_overlap=chunk_overlap,
            length_function=len,
            separators=["\n## ", "\n### ", "\n\n", "\n", " ", ""]
        )
    
    def load_documents(self) -> list[Document]:
        """
        Load all markdown files from the knowledge directory.
        
        Returns:
            List of Document objects, each representing a chunk
        """
        if not self.knowledge_dir.exists():
            logger.warning(
                "knowledge_dir_not_found",
                path=str(self.knowledge_dir)
            )
            return []
        
        documents = []
        md_files = list(self.knowledge_dir.glob("*.md"))
        
        logger.info(
            "loading_knowledge_base",
            file_count=len(md_files),
            directory=str(self.knowledge_dir)
        )
        
        for file_path in md_files:
            try:
                content = file_path.read_text(encoding="utf-8")
                
                # Create document with metadata
                doc = Document(
                    page_content=content,
                    metadata={
                        "source": file_path.name,
                        "category": self._extract_category(file_path.name)
                    }
                )
                documents.append(doc)
                
                logger.debug(
                    "loaded_file",
                    file=file_path.name,
                    size=len(content)
                )
                
            except Exception as e:
                logger.error(
                    "file_load_error",
                    file=file_path.name,
                    error=str(e)
                )
        
        return documents
    
    def load_and_split(self) -> list[Document]:
        """
        Load documents and split them into chunks.
        
        Returns:
            List of chunked Document objects
        """
        documents = self.load_documents()
        
        if not documents:
            return []
        
        chunks = self.text_splitter.split_documents(documents)
        
        logger.info(
            "documents_chunked",
            original_docs=len(documents),
            chunks=len(chunks),
            # max(..., 1) avoids ZeroDivisionError when every file is empty
            avg_chunk_size=sum(len(c.page_content) for c in chunks) // max(len(chunks), 1)
        )
        
        return chunks
    
    def _extract_category(self, filename: str) -> str:
        """Extract category from filename for filtering."""
        # Strip the extension and use the stem as the category key
        # (Path.stem removes only the final suffix, unlike str.replace)
        name = Path(filename).stem.lower()
        
        category_map = {
            "policies": "policy",
            "products": "product",
            "faq": "faq",
            "troubleshooting": "troubleshooting"
        }
        
        return category_map.get(name, "general")
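
The `separators` list tells the splitter to prefer breaking at `## ` and `### ` headings before falling back to blank lines, newlines, and spaces. The sketch below illustrates only that first preference with the standard library. It is a simplified stand-in, not the algorithm `RecursiveCharacterTextSplitter` implements (it ignores `chunk_size` and `chunk_overlap`), and `split_on_headings` is a hypothetical helper.

```python
import re


def split_on_headings(text: str) -> list[str]:
    """Split markdown so each chunk starts at a '## ' or '### ' heading."""
    # The lookahead keeps each heading attached to the text that follows it
    parts = re.split(r"\n(?=#{2,3} )", text)
    return [p.strip() for p in parts if p.strip()]


# Abbreviated sample in the style of data/knowledge_base/faq.md
sample = """# Frequently Asked Questions

## Orders & Shipping

### How do I track my order?
Log into your TechFlow account and open "Order History".

### Do you ship internationally?
Currently, we only ship within the continental United States.
"""

chunks = split_on_headings(sample)
for chunk in chunks:
    print(repr(chunk[:40]))
```

Splitting at headings keeps each question together with its answer, which tends to make retrieved chunks more self-contained than fixed-width slices.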


### knowledge/retriever.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.3
# File: src/caspar/knowledge/retriever.py

"""
Knowledge Base Retriever

Handles embedding, storage, and retrieval of knowledge base content
using ChromaDB for vector similarity search.
"""

from pathlib import Path
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_core.documents import Document

from caspar.config import settings, get_logger
from .loader import KnowledgeLoader

logger = get_logger(__name__)


class KnowledgeRetriever:
    """
    Retrieves relevant knowledge for customer queries.
    
    Uses ChromaDB for vector storage and OpenAI embeddings for
    semantic similarity search.
    """
    
    def __init__(
        self,
        persist_directory: str | None = None,
        collection_name: str = "techflow_knowledge"
    ):
        """
        Initialize the knowledge retriever.
        
        Args:
            persist_directory: Where to store ChromaDB data
                (defaults to settings.chroma_persist_directory)
            collection_name: Name of the ChromaDB collection
        """
        self.persist_directory = persist_directory or settings.chroma_persist_directory
        self.collection_name = collection_name
        
        # Initialize embeddings
        self.embeddings = OpenAIEmbeddings(
            api_key=settings.openai_api_key,
            model="text-embedding-3-small"  # Fast and cost-effective
        )
        
        self.vectorstore: Chroma | None = None
        self._initialized = False
    
    def initialize(self, force_reload: bool = False) -> None:
        """
        Initialize the vector store, loading documents if needed.
        
        Args:
            force_reload: If True, reload documents even if store exists
        """
        persist_path = Path(self.persist_directory)
        
        # Check if we already have a persisted store
        if persist_path.exists() and not force_reload:
            logger.info(
                "loading_existing_vectorstore",
                path=str(persist_path)
            )
            self.vectorstore = Chroma(
                persist_directory=str(persist_path),
                collection_name=self.collection_name,
                embedding_function=self.embeddings
            )
            self._initialized = True
            
            # Log collection stats
            collection = self.vectorstore._collection
            count = collection.count()
            logger.info("vectorstore_loaded", document_count=count)
            return
        
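        # Load and embed documents
        logger.info("creating_new_vectorstore")
        
        if force_reload and persist_path.exists():
            # Drop the existing collection first; otherwise from_documents()
            # appends to it and every chunk ends up stored twice
            Chroma(
                persist_directory=str(persist_path),
                collection_name=self.collection_name,
                embedding_function=self.embeddings
            ).delete_collection()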
        
        loader = KnowledgeLoader()
        documents = loader.load_and_split()
        
        if not documents:
            logger.warning("no_documents_to_embed")
            # Create empty store
            self.vectorstore = Chroma(
                persist_directory=str(persist_path),
                collection_name=self.collection_name,
                embedding_function=self.embeddings
            )
            self._initialized = True
            return
        
        # Create vectorstore with documents
        self.vectorstore = Chroma.from_documents(
            documents=documents,
            embedding=self.embeddings,
            persist_directory=str(persist_path),
            collection_name=self.collection_name
        )
        
        self._initialized = True
        logger.info(
            "vectorstore_created",
            document_count=len(documents),
            path=str(persist_path)
        )
    
    def retrieve(
        self,
        query: str,
        k: int | None = None,
        category_filter: str | None = None
    ) -> list[Document]:
        """
        Retrieve relevant documents for a query.
        
        Args:
            query: The search query
            k: Number of documents to retrieve (default from settings)
            category_filter: Optional category to filter by
            
        Returns:
            List of relevant Document objects
        """
        if not self._initialized:
            self.initialize()
        
        if not self.vectorstore:
            logger.warning("vectorstore_not_available")
            return []
        
        k = k or settings.retrieval_k
        
        # Build filter if category specified
        where_filter = None
        if category_filter:
            where_filter = {"category": category_filter}
        
        logger.debug(
            "retrieving_documents",
            query=query[:50],
            k=k,
            filter=category_filter
        )
        
        # Perform similarity search (filter=None means no metadata filtering)
        docs = self.vectorstore.similarity_search(
            query=query,
            k=k,
            filter=where_filter
        )
        
        logger.info(
            "documents_retrieved",
            query=query[:50],
            count=len(docs)
        )
        
        return docs
    
    def retrieve_with_scores(
        self,
        query: str,
        k: int | None = None
    ) -> list[tuple[Document, float]]:
        """
        Retrieve documents with similarity scores.
        
        Useful for debugging and understanding retrieval quality.
        
        Args:
            query: The search query
            k: Number of documents to retrieve
            
        Returns:
            List of (Document, score) tuples; the score is a distance,
            so a lower score means a more similar document
        """
        if not self._initialized:
            self.initialize()
        
        if not self.vectorstore:
            return []
        
        k = k or settings.retrieval_k
        
        results = self.vectorstore.similarity_search_with_score(
            query=query,
            k=k
        )
        
        return results
    
    def format_context(self, documents: list[Document]) -> str:
        """
        Format retrieved documents into a context string for the LLM.
        
        Args:
            documents: List of retrieved documents
            
        Returns:
            Formatted context string
        """
        if not documents:
            return "No relevant information found in knowledge base."
        
        context_parts = []
        
        for i, doc in enumerate(documents, 1):
            source = doc.metadata.get("source", "unknown")
            category = doc.metadata.get("category", "general")
            
            context_parts.append(
                f"[Source {i}: {source} ({category})]\n{doc.page_content}"
            )
        
        return "\n\n---\n\n".join(context_parts)


# Singleton instance for easy access
_retriever_instance: KnowledgeRetriever | None = None


def get_retriever() -> KnowledgeRetriever:
    """Get or create the global knowledge retriever instance."""
    global _retriever_instance
    
    if _retriever_instance is None:
        _retriever_instance = KnowledgeRetriever()
    
    return _retriever_instance
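
Because `retrieve_with_scores` returns distances (lower means more similar), one way to discard weak matches is a simple cutoff. The sketch below is an illustration only: `filter_by_distance` and the `MAX_DISTANCE` value are hypothetical and not part of CASPAR, and plain strings stand in for `Document` objects so the example runs on its own.

```python
from typing import TypeVar

T = TypeVar("T")

# Hypothetical cutoff; a sensible value depends on the embedding model and
# the distance metric configured for the collection.
MAX_DISTANCE = 0.8


def filter_by_distance(
    results: list[tuple[T, float]],
    max_distance: float = MAX_DISTANCE,
) -> list[tuple[T, float]]:
    """Keep only results whose distance is at or below the cutoff."""
    return [(doc, score) for doc, score in results if score <= max_distance]


# Plain strings stand in for Document objects here.
sample = [("returns.md", 0.42), ("shipping.md", 0.77), ("products.md", 1.31)]
kept = filter_by_distance(sample)
print([name for name, _ in kept])
```

Running the build script's retrieval test on your own knowledge base is probably the best way to see which score range separates useful chunks from noise before settling on a cutoff.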


### scripts/build_knowledge_base.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.3
# File: scripts/build_knowledge_base.py

"""
Build and test the CASPAR knowledge base.

This script:
1. Loads all knowledge documents
2. Creates embeddings and stores in ChromaDB
3. Tests retrieval with sample queries

Note: Make sure you've run 'pip install -e .' from the project root first!
"""

import sys
from pathlib import Path

from caspar.config import setup_logging, get_logger
from caspar.knowledge import KnowledgeLoader, KnowledgeRetriever

setup_logging()
logger = get_logger(__name__)


def build_knowledge_base():
    """Build the ChromaDB knowledge base from markdown files."""
    
    print("=" * 60)
    print("📚 Building CASPAR Knowledge Base")
    print("=" * 60)
    
    # Check that knowledge files exist
    kb_path = Path("data/knowledge_base")
    if not kb_path.exists():
        print(f"❌ Knowledge base directory not found: {kb_path}")
        print("   Create the directory and add your .md files")
        return False
    
    md_files = list(kb_path.glob("*.md"))
    print(f"\n📄 Found {len(md_files)} markdown files:")
    for f in md_files:
        size = f.stat().st_size / 1024
        print(f"   • {f.name} ({size:.1f} KB)")
    
    if not md_files:
        print("❌ No .md files found in knowledge base directory")
        return False
    
    # Load and preview documents
    print("\n📖 Loading documents...")
    loader = KnowledgeLoader()
    chunks = loader.load_and_split()
    
    if not chunks:
        print("❌ No chunks were produced - are the .md files empty?")
        return False
    
    print(f"✅ Created {len(chunks)} chunks")
    print(f"   Average chunk size: {sum(len(c.page_content) for c in chunks) // len(chunks)} characters")
    
    # Build vector store
    print("\n🔨 Building vector store...")
    retriever = KnowledgeRetriever()
    retriever.initialize(force_reload=True)
    
    print("✅ Vector store created and persisted")
    
    return True


def test_retrieval():
    """Test retrieval with sample queries."""
    
    print("\n" + "=" * 60)
    print("🧪 Testing Knowledge Retrieval")
    print("=" * 60)
    
    retriever = KnowledgeRetriever()
    retriever.initialize()
    
    test_queries = [
        "What is your return policy?",
        "How do I track my order?",
        "My laptop won't turn on, what should I do?",
        "What laptops do you sell?",
        "How long does shipping take?",
        "Can I pay with PayPal?",
        "My earbuds won't connect to my phone",
    ]
    
    for query in test_queries:
        print(f"\n📝 Query: {query}")
        print("-" * 50)
        
        results = retriever.retrieve_with_scores(query, k=2)
        
        for doc, score in results:
            source = doc.metadata.get("source", "unknown")
            preview = doc.page_content[:100].replace("\n", " ")
            print(f"   📄 [{source}] (score: {score:.3f})")
            print(f"      {preview}...")
    
    print("\n✅ Retrieval tests complete!")


def interactive_test():
    """Interactive mode for testing queries."""
    
    print("\n" + "=" * 60)
    print("🔍 Interactive Knowledge Search")
    print("=" * 60)
    print("Type your questions to test retrieval. Type 'quit' to exit.\n")
    
    retriever = KnowledgeRetriever()
    retriever.initialize()
    
    while True:
        query = input("Your question: ").strip()
        
        if query.lower() == "quit":
            break
        
        if not query:
            continue
        
        results = retriever.retrieve_with_scores(query, k=3)
        
        print(f"\n📚 Top {len(results)} results:\n")
        
        for i, (doc, score) in enumerate(results, 1):
            source = doc.metadata.get("source", "unknown")
            category = doc.metadata.get("category", "general")
            print(f"Result {i} [{source} - {category}] (score: {score:.3f}):")
            print("-" * 40)
            print(doc.page_content[:300])
            print("..." if len(doc.page_content) > 300 else "")
            print()


def main():
    """Run all knowledge base operations."""
    
    import argparse
    
    parser = argparse.ArgumentParser(description="Build and test CASPAR knowledge base")
    parser.add_argument("--build", action="store_true", help="Build the vector store")
    parser.add_argument("--test", action="store_true", help="Run retrieval tests")
    parser.add_argument("--interactive", action="store_true", help="Interactive query mode")
    
    args = parser.parse_args()
    
    # Default to build + test if no args
    if not any([args.build, args.test, args.interactive]):
        args.build = True
        args.test = True
    
    if args.build:
        success = build_knowledge_base()
        if not success:
            sys.exit(1)
    
    if args.test:
        test_retrieval()
    
    if args.interactive:
        interactive_test()
    
    print("\n🎉 Done!")


if __name__ == "__main__":
    main()


---
## Section 20.4: Building the conversation flow

This section implements the tools that CASPAR uses to help customers.

**Key files:**
- `src/caspar/tools/orders.py` - Order lookup and tracking
- `src/caspar/tools/tickets.py` - Support ticket creation
- `src/caspar/tools/accounts.py` - Account information
- `scripts/test_conversation_flow.py` - Conversation testing
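
The order and account convenience functions in this section (`get_order_status` and `get_account_info`) both return a plain dict with a `found` flag, plus either a `summary` or an `error`. A caller might turn that into reply text roughly as follows. This is a minimal sketch, not CASPAR's node code: `tool_result_to_text` and the fallback message are hypothetical.

```python
def tool_result_to_text(result: dict) -> str:
    """Pick the customer-facing text out of a tool result dict."""
    if result.get("found"):
        return result["summary"]
    # Lookups that fail carry an "error" message instead of a summary.
    return result.get("error", "Sorry, I could not find that information.")


# Hand-written dicts that mimic the two result shapes.
hit = {"found": True, "order": {"order_id": "TF-10001"}, "summary": "**Order TF-10001**"}
miss = {"found": False, "error": "Order TF-99999 not found."}

print(tool_result_to_text(hit))
print(tool_result_to_text(miss))
```

Keeping the result a plain dict rather than a Pydantic model makes it easy to store in agent state and serialize with the rest of the conversation.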

### tools/orders.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.4
# File: src/caspar/tools/orders.py

"""
Order Lookup Tool

Provides order status and tracking information.
In production, this would connect to your order management system.
"""

from datetime import datetime, timedelta, timezone
from typing import Literal
from pydantic import BaseModel
import random

from caspar.config import get_logger

logger = get_logger(__name__)


class OrderInfo(BaseModel):
    """Information about a customer order."""
    
    order_id: str
    customer_id: str
    status: Literal["processing", "shipped", "delivered", "cancelled", "returned"]
    items: list[dict]
    total: float
    order_date: str
    shipping_method: str
    tracking_number: str | None = None
    estimated_delivery: str | None = None
    delivery_date: str | None = None


class OrderLookupTool:
    """
    Tool for looking up order information.
    
    In production, this would query your order management system.
    For demo purposes, we use mock data.
    """
    
    def __init__(self):
        self._mock_orders = self._generate_mock_orders()
    
    def _generate_mock_orders(self) -> dict[str, OrderInfo]:
        """Generate mock order data for testing."""
        
        products = [
            {"name": "TechFlow Pro 15 Laptop", "price": 1299.00, "quantity": 1},
            {"name": "TechFlow Wireless Earbuds", "price": 129.00, "quantity": 1},
            {"name": "TechFlow USB-C Hub", "price": 79.00, "quantity": 2},
            {"name": "TechFlow Phone 12", "price": 799.00, "quantity": 1},
            {"name": "TechFlow Tab 10", "price": 449.00, "quantity": 1},
        ]
        
        statuses = ["processing", "shipped", "delivered", "shipped", "delivered"]
        shipping_methods = ["standard", "express", "overnight"]
        
        orders = {}
        base_date = datetime.now(timezone.utc)
        
        # Generate 20 mock orders
        for i in range(20):
            order_id = f"TF-{10000 + i}"
            customer_id = f"CUST-{1000 + (i % 5)}"
            
            num_items = random.randint(1, 3)
            order_items = random.sample(products, num_items)
            total = sum(item["price"] * item["quantity"] for item in order_items)
            
            status = statuses[i % len(statuses)]
            shipping = shipping_methods[i % len(shipping_methods)]
            order_date = base_date - timedelta(days=random.randint(1, 30))
            
            # Add tracking for shipped/delivered orders
            tracking = None
            estimated_delivery = None
            delivery_date = None
            
            if status in ["shipped", "delivered"]:
                tracking = f"1Z999AA{10000000 + i}"
                est_days = {"standard": 7, "express": 3, "overnight": 1}[shipping]
                estimated_delivery = (order_date + timedelta(days=est_days)).strftime("%Y-%m-%d")
                
                if status == "delivered":
                    delivery_date = estimated_delivery
            
            orders[order_id] = OrderInfo(
                order_id=order_id,
                customer_id=customer_id,
                status=status,
                items=order_items,
                total=total,
                order_date=order_date.strftime("%Y-%m-%d"),
                shipping_method=shipping,
                tracking_number=tracking,
                estimated_delivery=estimated_delivery,
                delivery_date=delivery_date,
            )
        
        return orders
    
    def lookup(self, order_id: str, customer_id: str | None = None) -> OrderInfo | None:
        """
        Look up an order by ID.
        
        Args:
            order_id: The order ID to look up
            customer_id: Optional customer ID for verification
            
        Returns:
            OrderInfo if found, None otherwise
        """
        logger.info("order_lookup", order_id=order_id, customer_id=customer_id)
        
        # Normalize order ID (accept "10001" or "TF-10001")
        order_id = order_id.upper().strip()
        if not order_id.startswith("TF-"):
            order_id = f"TF-{order_id}"
        
        order = self._mock_orders.get(order_id)
        
        if order is None:
            logger.warning("order_not_found", order_id=order_id)
            return None
        
        # Security: verify customer owns this order.
        # Note: the check only runs when a customer_id is supplied.
        if customer_id and order.customer_id != customer_id:
            logger.warning("order_customer_mismatch", order_id=order_id)
            return None
        
        logger.info("order_found", order_id=order_id, status=order.status)
        return order
    
    def get_tracking_url(self, tracking_number: str) -> str:
        """Generate a tracking URL for a shipment."""
        return f"https://track.techflow.com/{tracking_number}"
    
    def format_order_summary(self, order: OrderInfo) -> str:
        """Format order information for display to customer."""
        
        lines = [
            f"**Order {order.order_id}**",
            f"Status: {order.status.upper()}",
            f"Order Date: {order.order_date}",
            f"Shipping: {order.shipping_method.title()}",
            "",
            "Items:",
        ]
        
        for item in order.items:
            lines.append(f"  • {item['name']} (x{item['quantity']}) - ${item['price']:.2f}")
        
        lines.append(f"\nTotal: ${order.total:.2f}")
        
        if order.tracking_number:
            lines.append(f"\nTracking: {order.tracking_number}")
            lines.append(f"Track at: {self.get_tracking_url(order.tracking_number)}")
        
        if order.status == "shipped" and order.estimated_delivery:
            lines.append(f"Expected Delivery: {order.estimated_delivery}")
        elif order.status == "delivered" and order.delivery_date:
            lines.append(f"Delivered: {order.delivery_date}")
        
        return "\n".join(lines)


# Singleton instance
_order_tool: OrderLookupTool | None = None


def get_order_tool() -> OrderLookupTool:
    """Get or create the order lookup tool instance."""
    global _order_tool
    if _order_tool is None:
        _order_tool = OrderLookupTool()
    return _order_tool


def get_order_status(order_id: str, customer_id: str | None = None) -> dict:
    """
    Convenience function to look up order status.
    
    Returns a dict with order info or error message.
    """
    tool = get_order_tool()
    order = tool.lookup(order_id, customer_id)
    
    if order is None:
        return {
            "found": False,
            "error": f"Order {order_id} not found. Please check the order number and try again."
        }
    
    return {
        "found": True,
        "order": order.model_dump(),
        "summary": tool.format_order_summary(order)
    }
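
The ID normalization inside `lookup` is easy to check in isolation. The sketch below copies that logic into a standalone function; the name `normalize_order_id` is hypothetical and only used for illustration.

```python
def normalize_order_id(raw: str) -> str:
    """Upper-case, trim, and add the TF- prefix when it is missing."""
    order_id = raw.upper().strip()
    if not order_id.startswith("TF-"):
        order_id = f"TF-{order_id}"
    return order_id


for raw in ["10001", "tf-10001", "  TF-10001 "]:
    print(repr(raw), "->", normalize_order_id(raw))
```

All three inputs resolve to `TF-10001`. Note that an input without the hyphen, such as `tf10001`, becomes `TF-TF10001` under this logic, so a stricter parser may be worth considering if customers type IDs freely.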


### tools/tickets.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.4
# File: src/caspar/tools/tickets.py

"""
Ticket Creation Tool

Creates and manages customer support tickets.
In production, this would integrate with your ticketing system.
"""

from datetime import datetime, timezone
from typing import Literal
from pydantic import BaseModel
import uuid

from caspar.config import get_logger

logger = get_logger(__name__)


class Ticket(BaseModel):
    """A customer support ticket."""
    
    ticket_id: str
    customer_id: str
    conversation_id: str | None = None
    category: Literal["return", "refund", "technical", "billing", "shipping", "general"]
    priority: Literal["low", "medium", "high", "urgent"]
    subject: str
    description: str
    status: Literal["open", "in_progress", "waiting_customer", "resolved", "closed"] = "open"
    created_at: str
    updated_at: str
    assigned_to: str | None = None
    resolution: str | None = None


class TicketTool:
    """
    Tool for creating and managing support tickets.
    
    In production, this would integrate with Zendesk, Freshdesk, etc.
    """
    
    def __init__(self):
        self._tickets: dict[str, Ticket] = {}
    
    def create(
        self,
        customer_id: str,
        category: str,
        subject: str,
        description: str,
        priority: str = "medium",
        conversation_id: str | None = None,
    ) -> Ticket:
        """Create a new support ticket."""
        
        ticket_id = f"TKT-{uuid.uuid4().hex[:8].upper()}"
        now = datetime.now(timezone.utc).isoformat()
        
        ticket = Ticket(
            ticket_id=ticket_id,
            customer_id=customer_id,
            conversation_id=conversation_id,
            category=category,
            priority=priority,
            subject=subject,
            description=description,
            created_at=now,
            updated_at=now,
        )
        
        self._tickets[ticket_id] = ticket
        
        logger.info(
            "ticket_created",
            ticket_id=ticket_id,
            customer_id=customer_id,
            category=category,
            priority=priority
        )
        
        return ticket
    
    def get(self, ticket_id: str) -> Ticket | None:
        """Retrieve a ticket by ID."""
        return self._tickets.get(ticket_id)
    
    def get_customer_tickets(self, customer_id: str) -> list[Ticket]:
        """Get all tickets for a customer."""
        return [t for t in self._tickets.values() if t.customer_id == customer_id]
    
    def format_ticket_confirmation(self, ticket: Ticket) -> str:
        """Format ticket info for customer confirmation."""
        
        priority_emoji = {"low": "🟢", "medium": "🟡", "high": "🟠", "urgent": "🔴"}
        
        return f"""**Support Ticket Created**

Ticket ID: {ticket.ticket_id}
Category: {ticket.category.title()}
Priority: {priority_emoji.get(ticket.priority, "⚪")} {ticket.priority.title()}
Subject: {ticket.subject}

Our team will review your ticket and respond within:
- Urgent: 2 hours
- High: 4 hours  
- Medium: 24 hours
- Low: 48 hours

You can reference ticket {ticket.ticket_id} in future conversations."""


# Singleton instance
_ticket_tool: TicketTool | None = None


def get_ticket_tool() -> TicketTool:
    """Get or create the ticket tool instance."""
    global _ticket_tool
    if _ticket_tool is None:
        _ticket_tool = TicketTool()
    return _ticket_tool


def create_ticket(
    customer_id: str,
    category: str,
    subject: str,
    description: str,
    priority: str = "medium",
    conversation_id: str | None = None,
) -> dict:
    """Convenience function to create a ticket."""
    
    tool = get_ticket_tool()
    
    # Validate inputs
    valid_categories = ["return", "refund", "technical", "billing", "shipping", "general"]
    if category.lower() not in valid_categories:
        category = "general"
    
    valid_priorities = ["low", "medium", "high", "urgent"]
    if priority.lower() not in valid_priorities:
        priority = "medium"
    
    ticket = tool.create(
        customer_id=customer_id,
        category=category.lower(),
        subject=subject,
        description=description,
        priority=priority.lower(),
        conversation_id=conversation_id,
    )
    
    return {
        "success": True,
        "ticket": ticket.model_dump(),
        "confirmation": tool.format_ticket_confirmation(ticket)
    }
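
The confirmation text promises a response window per priority (2, 4, 24, or 48 hours). If you want to track that promise, a due time can be derived from the ticket's `created_at` timestamp. The following is a sketch under assumptions: `response_due` and `RESPONSE_HOURS` are hypothetical names that do not exist in the project, and unknown priorities fall back to the medium window.

```python
from datetime import datetime, timedelta, timezone

# Response windows quoted in the ticket confirmation text.
RESPONSE_HOURS = {"urgent": 2, "high": 4, "medium": 24, "low": 48}


def response_due(created_at: str, priority: str) -> str:
    """Return the ISO timestamp by which a first response is expected."""
    created = datetime.fromisoformat(created_at)
    hours = RESPONSE_HOURS.get(priority, RESPONSE_HOURS["medium"])
    return (created + timedelta(hours=hours)).isoformat()


# A fixed timestamp in the same ISO format that TicketTool.create stores.
created = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc).isoformat()
print(response_due(created, "urgent"))
```

Defining the windows in one mapping would also let the confirmation message and any follow-up reminders read from the same source instead of repeating the numbers.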


### tools/accounts.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.4
# File: src/caspar/tools/accounts.py

"""
Account Information Tool

Retrieves customer account information.
In production, this would connect to your CRM or user database.
"""

from datetime import datetime, timedelta, timezone
from pydantic import BaseModel
import random

from caspar.config import get_logger

logger = get_logger(__name__)


class CustomerAccount(BaseModel):
    """Customer account information."""
    
    customer_id: str
    email: str
    name: str
    phone: str | None = None
    member_since: str
    loyalty_tier: str  # bronze, silver, gold, platinum
    loyalty_points: int
    total_orders: int
    total_spent: float
    default_shipping_address: dict | None = None
    payment_methods_on_file: int
    email_verified: bool = True
    two_factor_enabled: bool = False


class AccountTool:
    """Tool for retrieving customer account information."""
    
    def __init__(self):
        self._mock_accounts = self._generate_mock_accounts()
    
    def _generate_mock_accounts(self) -> dict[str, CustomerAccount]:
        """Generate mock customer data."""
        
        tiers = ["bronze", "silver", "gold", "platinum"]
        
        mock_customers = [
            ("CUST-1000", "john.doe@email.com", "John Doe"),
            ("CUST-1001", "jane.smith@email.com", "Jane Smith"),
            ("CUST-1002", "bob.wilson@email.com", "Bob Wilson"),
            ("CUST-1003", "alice.jones@email.com", "Alice Jones"),
            ("CUST-1004", "charlie.brown@email.com", "Charlie Brown"),
        ]
        
        accounts = {}
        for i, (cust_id, email, name) in enumerate(mock_customers):
            tier_index = min(i, len(tiers) - 1)
            orders = (i + 1) * 5
            spent = orders * random.uniform(100, 500)
            
            accounts[cust_id] = CustomerAccount(
                customer_id=cust_id,
                email=email,
                name=name,
                phone=f"+1-555-{1000 + i:04d}" if i % 2 == 0 else None,
                member_since=(datetime.now(timezone.utc) - timedelta(days=365 * (i + 1))).strftime("%Y-%m-%d"),
                loyalty_tier=tiers[tier_index],
                loyalty_points=int(spent * 10),
                total_orders=orders,
                total_spent=round(spent, 2),
                default_shipping_address={
                    "street": f"{100 + i} Main Street",
                    "city": "Anytown",
                    "state": "CA",
                    "zip": f"9{1000 + i}",
                } if i % 2 == 0 else None,
                payment_methods_on_file=min(i + 1, 3),
                two_factor_enabled=i > 2,
            )
        
        return accounts
    
    def get_account(self, customer_id: str) -> CustomerAccount | None:
        """Retrieve account information by customer ID."""
        logger.info("account_lookup", customer_id=customer_id)
        
        account = self._mock_accounts.get(customer_id)
        
        if account is None:
            logger.warning("account_not_found", customer_id=customer_id)
            return None
        
        logger.info("account_found", customer_id=customer_id, tier=account.loyalty_tier)
        return account
    
    def format_account_summary(self, account: CustomerAccount) -> str:
        """Format account info for display to customer."""
        
        tier_emoji = {"bronze": "🥉", "silver": "🥈", "gold": "🥇", "platinum": "💎"}
        
        lines = [
            f"**Account Summary for {account.name}**",
            "",
            f"Member Since: {account.member_since}",
            f"Loyalty Status: {tier_emoji.get(account.loyalty_tier, '')} {account.loyalty_tier.title()}",
            f"Loyalty Points: {account.loyalty_points:,}",
            "",
            f"Total Orders: {account.total_orders}",
            f"Total Spent: ${account.total_spent:,.2f}",
            "",
            f"Email: {account.email} {'✓ Verified' if account.email_verified else '⚠ Not verified'}",
        ]
        
        if account.phone:
            lines.append(f"Phone: {account.phone}")
        
        if account.default_shipping_address:
            addr = account.default_shipping_address
            lines.append("\nDefault Shipping Address:")
            lines.append(f"  {addr['street']}")
            lines.append(f"  {addr['city']}, {addr['state']} {addr['zip']}")
        
        lines.append(f"\nPayment Methods: {account.payment_methods_on_file} on file")
        lines.append(f"Two-Factor Auth: {'✓ Enabled' if account.two_factor_enabled else 'Not enabled'}")
        
        return "\n".join(lines)


# Singleton instance
_account_tool: AccountTool | None = None


def get_account_tool() -> AccountTool:
    """Get or create the account tool instance."""
    global _account_tool
    if _account_tool is None:
        _account_tool = AccountTool()
    return _account_tool


def get_account_info(customer_id: str) -> dict:
    """Convenience function to get account information."""
    tool = get_account_tool()
    account = tool.get_account(customer_id)
    
    if account is None:
        return {
            "found": False,
            "error": f"Account {customer_id} not found."
        }
    
    return {
        "found": True,
        "account": account.model_dump(),
        "summary": tool.format_account_summary(account)
    }


### scripts/test_conversation_flow.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.4
# File: scripts/test_conversation_flow.py

"""Test complete conversation flows through CASPAR."""

import asyncio
from langchain_core.messages import HumanMessage

from caspar.agent import create_agent, create_initial_state
from caspar.config import setup_logging, get_logger

setup_logging()
logger = get_logger(__name__)


async def test_flow(name: str, message: str, customer_id: str = "CUST-1000"):
    """Run a single test flow."""
    print(f"\n{'=' * 60}")
    print(f"🧪 Test: {name}")
    print(f"{'=' * 60}")
    
    agent = await create_agent()
    state = create_initial_state(conversation_id=f"test-{name}", customer_id=customer_id)
    state["messages"] = [HumanMessage(content=message)]
    
    config = {"configurable": {"thread_id": f"test-{name}"}}
    result = await agent.ainvoke(state, config)
    
    print(f"Customer: {message}")
    print(f"Intent: {result.get('intent', 'N/A')}")
    print(f"Sentiment: {result.get('sentiment_score', 'N/A')}")
    if result.get('ticket_id'):
        print(f"Ticket: {result['ticket_id']}")
    print(f"\nCASPAR: {result['messages'][-1].content}")
    
    return result


async def main():
    """Run all tests."""
    
    # Test FAQ
    await test_flow("FAQ", "What is your return policy?")
    
    # Test Order Inquiry
    await test_flow("Order", "Where is my order TF-10001?")
    
    # Test Account
    await test_flow("Account", "What's my loyalty status?", "CUST-1001")
    
    # Test Complaint
    await test_flow("Complaint", "My laptop arrived damaged! This is unacceptable!")
    
    # Test Handoff
    await test_flow("Handoff", "I want to speak to a human agent please")
    
    print(f"\n{'=' * 60}")
    print("✅ All tests complete!")
    print(f"{'=' * 60}")


if __name__ == "__main__":
    asyncio.run(main())


---
## Section 20.5: Adding human handoff capabilities

This section implements the human escalation system, used when CASPAR cannot resolve an issue on its own.

**Key files:**
- `src/caspar/handoff/triggers.py` - Escalation detection
- `src/caspar/handoff/queue.py` - Handoff queue management
- `src/caspar/handoff/context.py` - Context packaging for handoff
- `src/caspar/handoff/notifications.py` - Agent notifications
- `src/caspar/handoff/approval.py` - Human approval workflows
- `src/caspar/agent/nodes_handoff_update.py` - Sentiment and handoff nodes

### handoff/triggers.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.5
# File: src/caspar/handoff/triggers.py

"""
Escalation Trigger Detection

Identifies situations that require human intervention.
"""

from enum import Enum
from pydantic import BaseModel

from caspar.config import settings, get_logger

logger = get_logger(__name__)


class EscalationTrigger(str, Enum):
    """Types of escalation triggers."""
    
    EXPLICIT_REQUEST = "explicit_request"
    HIGH_FRUSTRATION = "high_frustration"
    REPEATED_FAILURES = "repeated_failures"
    POLICY_EXCEPTION = "policy_exception"
    VIP_CUSTOMER = "vip_customer"
    SENSITIVE_TOPIC = "sensitive_topic"
    COMPLEX_ISSUE = "complex_issue"
    MAX_TURNS_REACHED = "max_turns_reached"


class EscalationResult(BaseModel):
    """Result of escalation check."""
    
    should_escalate: bool
    triggers: list[EscalationTrigger]
    priority: str  # "low", "medium", "high", "urgent"
    reason: str


def check_escalation_triggers(
    state: dict,
    customer_tier: str | None = None,
) -> EscalationResult:
    """
    Check all escalation triggers against current state.
    
    Args:
        state: Current agent state
        customer_tier: Customer's loyalty tier (if known)
        
    Returns:
        EscalationResult with triggers found and recommended priority
    """
    triggers = []
    reasons = []
    
    # Check explicit request (already classified as handoff_request)
    if state.get("intent") == "handoff_request":
        triggers.append(EscalationTrigger.EXPLICIT_REQUEST)
        reasons.append("Customer requested human agent")
    
    # Check frustration level (handle None values)
    sentiment = state.get("sentiment_score")
    if sentiment is None:
        sentiment = 0.0
    frustration = state.get("frustration_level") or "low"
    
    if sentiment < settings.sentiment_threshold or frustration == "high":
        triggers.append(EscalationTrigger.HIGH_FRUSTRATION)
        reasons.append(f"High frustration detected (sentiment: {sentiment})")
    
    # Check turn count
    turn_count = state.get("turn_count") or 0
    if turn_count >= settings.max_conversation_turns:
        triggers.append(EscalationTrigger.MAX_TURNS_REACHED)
        reasons.append(f"Conversation reached the limit of {settings.max_conversation_turns} turns")
    
    # Check for VIP customer
    if customer_tier in ["gold", "platinum"]:
        # VIP customers get faster escalation on any issue
        if state.get("intent") == "complaint" or frustration in ["medium", "high"]:
            triggers.append(EscalationTrigger.VIP_CUSTOMER)
            reasons.append(f"VIP customer ({customer_tier} tier) with issue")
    
    # Check for policy exceptions (would need order info)
    order_info = state.get("order_info") or {}
    if order_info.get("full_order"):
        order_total = order_info["full_order"].get("total", 0)
        if order_total > 500 and state.get("intent") == "complaint":
            triggers.append(EscalationTrigger.POLICY_EXCEPTION)
            reasons.append(f"High-value order (${order_total}) with complaint")
    
    # Determine priority based on triggers
    priority = _calculate_priority(triggers)
    
    result = EscalationResult(
        should_escalate=len(triggers) > 0,
        triggers=triggers,
        priority=priority,
        reason="; ".join(reasons) if reasons else "No escalation needed"
    )
    
    if result.should_escalate:
        logger.info(
            "escalation_triggers_detected",
            triggers=[t.value for t in triggers],
            priority=priority
        )
    
    return result


def _calculate_priority(triggers: list[EscalationTrigger]) -> str:
    """Calculate escalation priority based on triggers."""
    
    if not triggers:
        return "low"
    
    # Urgent triggers
    urgent_triggers = {
        EscalationTrigger.EXPLICIT_REQUEST,
        EscalationTrigger.HIGH_FRUSTRATION,
        EscalationTrigger.SENSITIVE_TOPIC,
    }
    
    # High priority triggers
    high_triggers = {
        EscalationTrigger.VIP_CUSTOMER,
        EscalationTrigger.POLICY_EXCEPTION,
        EscalationTrigger.REPEATED_FAILURES,
    }
    
    if any(t in urgent_triggers for t in triggers):
        return "urgent"
    elif any(t in high_triggers for t in triggers):
        return "high"
    elif len(triggers) >= 2:
        # Multiple medium triggers escalate to high
        return "high"
    else:
        return "medium"


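def check_sensitive_topics(message: str) -> bool:
    """Check if message contains sensitive topics requiring human handling."""
    import re  # local import keeps the word-boundary match below self-contained
    
    sensitive_keywords = [
        "lawyer", "lawsuit", "legal action", "sue",
        "police", "fraud", "scam", "stolen",
        "safety", "dangerous", "injury", "injured", "hurt",
        "discrimination", "harassment",
        "cancel account", "delete my data", "gdpr",
    ]
    
    # Anchor each keyword at the start of a word so that "sue" does not fire on
    # "issue", while inflected forms such as "sued" or "lawyers" still match.
    pattern = r"\b(?:" + "|".join(re.escape(k) for k in sensitive_keywords) + ")"
    return re.search(pattern, message.lower()) is not None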

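The priority rules in `_calculate_priority` can be checked in isolation. The sketch below is a standalone approximation rather than the project code: `Trigger` is a hypothetical stand-in for `EscalationTrigger` (string values other than `high_frustration`, `vip_customer` and `policy_exception` are assumed), and `priority_for` is an illustrative name that follows the same urgent/high/medium/low rules.

```python
from enum import Enum


class Trigger(str, Enum):
    """Hypothetical stand-in for EscalationTrigger (some values are assumed)."""
    EXPLICIT_REQUEST = "explicit_request"
    HIGH_FRUSTRATION = "high_frustration"
    SENSITIVE_TOPIC = "sensitive_topic"
    VIP_CUSTOMER = "vip_customer"
    POLICY_EXCEPTION = "policy_exception"
    REPEATED_FAILURES = "repeated_failures"
    MAX_TURNS_REACHED = "max_turns_reached"


URGENT = {Trigger.EXPLICIT_REQUEST, Trigger.HIGH_FRUSTRATION, Trigger.SENSITIVE_TOPIC}
HIGH = {Trigger.VIP_CUSTOMER, Trigger.POLICY_EXCEPTION, Trigger.REPEATED_FAILURES}


def priority_for(triggers: list[Trigger]) -> str:
    """Any urgent trigger wins, then any high trigger, then the trigger count."""
    if not triggers:
        return "low"
    if any(t in URGENT for t in triggers):
        return "urgent"
    if any(t in HIGH for t in triggers) or len(triggers) >= 2:
        return "high"
    return "medium"


examples = {
    "no triggers": [],
    "turn limit only": [Trigger.MAX_TURNS_REACHED],
    "VIP + turn limit": [Trigger.VIP_CUSTOMER, Trigger.MAX_TURNS_REACHED],
    "VIP + frustration": [Trigger.VIP_CUSTOMER, Trigger.HIGH_FRUSTRATION],
}
for label, triggers in examples.items():
    print(f"{label}: {priority_for(triggers)}")
```

Because the urgent set is checked first, a single urgent trigger outranks any number of high-priority ones.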

### handoff/queue.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.5
# File: src/caspar/handoff/queue.py

"""
Handoff Queue Management

Manages the queue of conversations waiting for human agents.
"""

from datetime import datetime, timezone
from enum import Enum
from typing import Literal
from pydantic import BaseModel, Field
import uuid

from caspar.config import get_logger

logger = get_logger(__name__)


class HandoffStatus(str, Enum):
    """Status of a handoff request."""
    
    QUEUED = "queued"
    ASSIGNED = "assigned"
    IN_PROGRESS = "in_progress"
    RESOLVED = "resolved"
    ABANDONED = "abandoned"


class HandoffRequest(BaseModel):
    """A request for human agent assistance."""
    
    request_id: str = Field(default_factory=lambda: f"HO-{uuid.uuid4().hex[:8].upper()}")
    conversation_id: str
    customer_id: str
    ticket_id: str | None = None
    
    priority: Literal["low", "medium", "high", "urgent"]
    triggers: list[str]  # EscalationTrigger values
    reason: str
    
    status: HandoffStatus = HandoffStatus.QUEUED
    assigned_agent: str | None = None
    
    created_at: str = Field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
    updated_at: str = Field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
    assigned_at: str | None = None
    resolved_at: str | None = None
    
    # Estimated wait time in minutes (calculated based on queue position)
    estimated_wait: int | None = None


class HandoffQueue:
    """
    Manages the queue of pending handoff requests.
    
    In production, this would be backed by Redis or a database.
    For demo purposes, we use in-memory storage.
    """
    
    def __init__(self):
        self._queue: dict[str, HandoffRequest] = {}
        self._by_conversation: dict[str, str] = {}  # conversation_id -> request_id
    
    def add(
        self,
        conversation_id: str,
        customer_id: str,
        priority: str,
        triggers: list[str],
        reason: str,
        ticket_id: str | None = None,
    ) -> HandoffRequest:
        """Add a new handoff request to the queue."""
        
        # Check if conversation already has a pending request
        if conversation_id in self._by_conversation:
            existing_id = self._by_conversation[conversation_id]
            existing = self._queue.get(existing_id)
            if existing and existing.status == HandoffStatus.QUEUED:
                logger.info("handoff_already_queued", conversation_id=conversation_id)
                return existing
        
        request = HandoffRequest(
            conversation_id=conversation_id,
            customer_id=customer_id,
            ticket_id=ticket_id,
            priority=priority,
            triggers=triggers,
            reason=reason,
            estimated_wait=self._estimate_wait_time(priority),
        )
        
        self._queue[request.request_id] = request
        self._by_conversation[conversation_id] = request.request_id
        
        logger.info(
            "handoff_queued",
            request_id=request.request_id,
            conversation_id=conversation_id,
            priority=priority,
            position=self.get_queue_position(request.request_id)
        )
        
        return request
    
    def _estimate_wait_time(self, priority: str) -> int:
        """Estimate wait time based on queue and priority."""
        
        # Count requests ahead in queue by priority
        queued = [r for r in self._queue.values() if r.status == HandoffStatus.QUEUED]
        
        # Priority weights (urgent gets served first)
        priority_order = {"urgent": 0, "high": 1, "medium": 2, "low": 3}
        my_priority = priority_order.get(priority, 2)
        
        ahead_count = sum(
            1 for r in queued 
            if priority_order.get(r.priority, 2) <= my_priority
        )
        
        # Assume ~5 minutes per request ahead
        base_wait = ahead_count * 5
        
        # Adjust by priority
        if priority == "urgent":
            return max(2, base_wait // 2)
        elif priority == "high":
            return max(5, base_wait)
        else:
            return base_wait + 5
    
    def get(self, request_id: str) -> HandoffRequest | None:
        """Get a handoff request by ID."""
        return self._queue.get(request_id)
    
    def get_by_conversation(self, conversation_id: str) -> HandoffRequest | None:
        """Get the handoff request for a conversation."""
        request_id = self._by_conversation.get(conversation_id)
        if request_id:
            return self._queue.get(request_id)
        return None
    
    def get_queue_position(self, request_id: str) -> int:
        """Get position in queue (1-indexed)."""
        request = self._queue.get(request_id)
        if not request or request.status != HandoffStatus.QUEUED:
            return 0
        
        # Sort by priority then by created_at
        priority_order = {"urgent": 0, "high": 1, "medium": 2, "low": 3}
        
        queued = [
            r for r in self._queue.values() 
            if r.status == HandoffStatus.QUEUED
        ]
        queued.sort(key=lambda r: (priority_order.get(r.priority, 2), r.created_at))
        
        for i, r in enumerate(queued, 1):
            if r.request_id == request_id:
                return i
        
        return 0
    
    def assign(self, request_id: str, agent_id: str) -> HandoffRequest | None:
        """Assign a request to a human agent."""
        request = self._queue.get(request_id)
        if not request:
            return None
        
        request.status = HandoffStatus.ASSIGNED
        request.assigned_agent = agent_id
        request.assigned_at = datetime.now(timezone.utc).isoformat()
        request.updated_at = datetime.now(timezone.utc).isoformat()
        
        logger.info(
            "handoff_assigned",
            request_id=request_id,
            agent_id=agent_id
        )
        
        return request
    
    def resolve(self, request_id: str, resolution: str = "resolved") -> HandoffRequest | None:
        """Mark a handoff request as resolved."""
        request = self._queue.get(request_id)
        if not request:
            return None
        
        request.status = HandoffStatus.RESOLVED
        request.resolved_at = datetime.now(timezone.utc).isoformat()
        request.updated_at = datetime.now(timezone.utc).isoformat()
        
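        # Clean up conversation mapping, but only if it still points at this request
        # (a newer request for the same conversation may have replaced it)
        if self._by_conversation.get(request.conversation_id) == request_id:
            del self._by_conversation[request.conversation_id]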
        
        logger.info("handoff_resolved", request_id=request_id)
        
        return request
    
    def get_pending_count(self) -> dict[str, int]:
        """Get count of pending requests by priority."""
        counts = {"urgent": 0, "high": 0, "medium": 0, "low": 0}
        
        for request in self._queue.values():
            if request.status == HandoffStatus.QUEUED:
                counts[request.priority] = counts.get(request.priority, 0) + 1
        
        return counts


# Singleton instance
_handoff_queue: HandoffQueue | None = None


def get_handoff_queue() -> HandoffQueue:
    """Get or create the global handoff queue."""
    global _handoff_queue
    if _handoff_queue is None:
        _handoff_queue = HandoffQueue()
    return _handoff_queue

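The wait-time and position arithmetic in `HandoffQueue` is easier to follow with concrete numbers. The following is a minimal stdlib-only sketch, assuming the queue is represented as plain lists and tuples; `estimate_wait` and `queue_position` are illustrative names, and the timestamps are made-up sample data.

```python
PRIORITY_ORDER = {"urgent": 0, "high": 1, "medium": 2, "low": 3}


def estimate_wait(queued_priorities: list[str], priority: str) -> int:
    """Estimate wait in minutes for a new request, given priorities already queued."""
    mine = PRIORITY_ORDER.get(priority, 2)
    # Requests of equal or higher priority are served before the new one
    ahead = sum(1 for p in queued_priorities if PRIORITY_ORDER.get(p, 2) <= mine)
    base_wait = ahead * 5  # roughly 5 minutes per request ahead
    if priority == "urgent":
        return max(2, base_wait // 2)
    if priority == "high":
        return max(5, base_wait)
    return base_wait + 5


def queue_position(queue: list[tuple[str, str, str]], request_id: str) -> int:
    """1-indexed position among (request_id, priority, created_at) entries; 0 if absent."""
    ordered = sorted(queue, key=lambda r: (PRIORITY_ORDER.get(r[1], 2), r[2]))
    for position, (rid, _, _) in enumerate(ordered, 1):
        if rid == request_id:
            return position
    return 0


waiting = ["high", "high", "high"]
print(estimate_wait(waiting, "urgent"))   # nothing urgent ahead, so the 2-minute floor applies
print(estimate_wait(waiting, "medium"))   # 3 ahead * 5 minutes + 5

queue = [
    ("HO-A", "medium", "2025-01-01T10:00:00+00:00"),
    ("HO-B", "urgent", "2025-01-01T10:05:00+00:00"),
    ("HO-C", "medium", "2025-01-01T09:55:00+00:00"),
]
print([queue_position(queue, rid) for rid in ("HO-A", "HO-B", "HO-C")])
```

The urgent request jumps ahead of both medium requests even though it arrived last; within a priority level, the earlier `created_at` wins.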

### handoff/context.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.5
# File: src/caspar/handoff/context.py

"""
Context Packaging for Human Agents

Prepares comprehensive context to help human agents
quickly understand and resolve customer issues.
"""

from datetime import datetime, timezone
from pydantic import BaseModel
from langchain_core.messages import HumanMessage

from caspar.config import get_logger

logger = get_logger(__name__)


class ConversationContext(BaseModel):
    """Complete context package for human agent."""
    
    # Identification
    conversation_id: str
    customer_id: str
    request_id: str
    
    # Customer Info
    customer_name: str | None = None
    customer_email: str | None = None
    customer_tier: str | None = None
    customer_history: str | None = None  # Brief history summary
    
    # Conversation Summary
    conversation_summary: str
    message_count: int
    conversation_duration: str | None = None
    
    # Issue Details
    detected_intent: str
    escalation_triggers: list[str]
    escalation_reason: str
    sentiment_score: float
    frustration_level: str
    
    # Relevant Data
    order_info: dict | None = None
    ticket_id: str | None = None
    retrieved_knowledge: str | None = None
    
    # Full Transcript
    transcript: list[dict]
    
    # Recommendations
    suggested_actions: list[str]
    
    # Metadata
    packaged_at: str


def package_context_for_agent(
    state: dict,
    request_id: str,
    customer_info: dict | None = None,
) -> ConversationContext:
    """
    Package all relevant context for a human agent.
    
    Args:
        state: Current agent state
        request_id: The handoff request ID
        customer_info: Optional customer account info
        
    Returns:
        ConversationContext with all relevant information
    """
    messages = state.get("messages") or []
    
    # Build transcript
    transcript = []
    for msg in messages:
        transcript.append({
            "role": "customer" if isinstance(msg, HumanMessage) else "caspar",
            "content": msg.content,
        })
    
    # Generate conversation summary
    summary = _generate_summary(messages)
    
    # Extract customer info if provided
    customer_name = None
    customer_email = None
    customer_tier = None
    customer_history = None
    
    if customer_info:
        customer_name = customer_info.get("name")
        customer_email = customer_info.get("email")
        customer_tier = customer_info.get("loyalty_tier")
        customer_history = f"{customer_info.get('total_orders', 0)} orders, ${customer_info.get('total_spent', 0):,.2f} total"
    
    # Generate suggested actions based on intent and triggers
    suggested_actions = _generate_suggestions(state)
    
    context = ConversationContext(
        conversation_id=state.get("conversation_id") or "unknown",
        customer_id=state.get("customer_id") or "unknown",
        request_id=request_id,
        customer_name=customer_name,
        customer_email=customer_email,
        customer_tier=customer_tier,
        customer_history=customer_history,
        conversation_summary=summary,
        message_count=len(messages),
        detected_intent=state.get("intent") or "unknown",
        escalation_triggers=state.get("escalation_triggers") or [],
        escalation_reason=state.get("escalation_reason") or "Unknown",
        sentiment_score=state.get("sentiment_score") or 0.0,
        frustration_level=state.get("frustration_level") or "unknown",
        order_info=state.get("order_info"),
        ticket_id=state.get("ticket_id"),
        retrieved_knowledge=state.get("retrieved_context"),
        transcript=transcript,
        suggested_actions=suggested_actions,
        packaged_at=datetime.now(timezone.utc).isoformat(),
    )
    
    logger.info(
        "context_packaged",
        conversation_id=context.conversation_id,
        message_count=context.message_count
    )
    
    return context


def _generate_summary(messages: list) -> str:
    """Generate a brief summary of the conversation."""
    
    if not messages:
        return "No messages in conversation."
    
    # Get first customer message (the initial inquiry)
    first_customer_msg = None
    for msg in messages:
        if isinstance(msg, HumanMessage):
            first_customer_msg = msg.content
            break
    
    # Get last customer message (most recent concern)
    last_customer_msg = None
    for msg in reversed(messages):
        if isinstance(msg, HumanMessage):
            last_customer_msg = msg.content
            break
    
    summary_parts = []
    
    if first_customer_msg:
        # Truncate if too long
        initial = first_customer_msg[:150] + "..." if len(first_customer_msg) > 150 else first_customer_msg
        summary_parts.append(f"Initial inquiry: {initial}")
    
    if last_customer_msg and last_customer_msg != first_customer_msg:
        recent = last_customer_msg[:150] + "..." if len(last_customer_msg) > 150 else last_customer_msg
        summary_parts.append(f"Most recent message: {recent}")
    
    summary_parts.append(f"Total exchanges: {len(messages)} messages")
    
    return "\n".join(summary_parts)


def _generate_suggestions(state: dict) -> list[str]:
    """Generate suggested actions for the human agent."""
    
    suggestions = []
    intent = state.get("intent") or ""
    triggers = state.get("escalation_triggers") or []
    
    # Intent-based suggestions
    if intent == "complaint":
        suggestions.append("Acknowledge the customer's frustration and apologize for the inconvenience")
        suggestions.append("Review order history for context")
        
    if intent == "order_inquiry":
        suggestions.append("Verify order status in the system")
        suggestions.append("Check for any shipping delays or issues")
    
    # Trigger-based suggestions
    if "high_frustration" in triggers:
        suggestions.append("⚠️ Customer is highly frustrated - prioritize empathy")
        suggestions.append("Consider offering a goodwill gesture (discount, expedited shipping)")
    
    if "vip_customer" in triggers:
        suggestions.append("⭐ VIP Customer - consider premium resolution options")
    
    if "policy_exception" in triggers:
        suggestions.append("This may require manager approval for policy exception")
    
    # Order-specific suggestions
    order_info = state.get("order_info") or {}
    if order_info.get("status") == "processing":
        suggestions.append("Order is still processing - can offer to expedite if needed")
    elif order_info.get("status") == "shipped":
        suggestions.append("Order is in transit - check tracking for delays")
    
    # Default suggestions
    if not suggestions:
        suggestions.append("Review the conversation transcript for context")
        suggestions.append("Ask clarifying questions if needed")
    
    return suggestions


def format_context_for_display(context: ConversationContext) -> str:
    """Format context as readable text for agent interface."""
    
    lines = [
        "=" * 60,
        "🎫 HANDOFF CONTEXT",
        "=" * 60,
        "",
        f"Request ID: {context.request_id}",
        f"Conversation: {context.conversation_id}",
        f"Customer: {context.customer_id}",
        "",
    ]
    
    # Customer info if available
    if context.customer_name:
        lines.append("👤 CUSTOMER INFO")
        lines.append("-" * 40)
        lines.append(f"Name: {context.customer_name}")
        if context.customer_email:
            lines.append(f"Email: {context.customer_email}")
        if context.customer_tier:
            lines.append(f"Tier: {context.customer_tier.upper()}")
        if context.customer_history:
            lines.append(f"History: {context.customer_history}")
        lines.append("")
    
    # Issue summary
    lines.append("📋 ISSUE SUMMARY")
    lines.append("-" * 40)
    lines.append(f"Intent: {context.detected_intent}")
    lines.append(f"Sentiment: {context.sentiment_score:.2f} ({context.frustration_level} frustration)")
    lines.append(f"Reason: {context.escalation_reason}")
    lines.append("")
    
    # Suggested actions
    lines.append("💡 SUGGESTED ACTIONS")
    lines.append("-" * 40)
    for action in context.suggested_actions:
        lines.append(f"  • {action}")
    lines.append("")
    
    # Conversation summary
    lines.append("📝 CONVERSATION SUMMARY")
    lines.append("-" * 40)
    lines.append(context.conversation_summary)
    lines.append("")
    
    # Transcript
    lines.append("💬 TRANSCRIPT")
    lines.append("-" * 40)
    for msg in context.transcript:
        role = "Customer" if msg["role"] == "customer" else "CASPAR"
        content = msg["content"][:200] + "..." if len(msg["content"]) > 200 else msg["content"]
        lines.append(f"{role}: {content}")
        lines.append("")
    
    lines.append("=" * 60)
    
    return "\n".join(lines)

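The summary logic in `_generate_summary` keeps only the first and most recent customer messages, each truncated to 150 characters. A rough standalone sketch of that idea follows, assuming messages are plain `(role, text)` tuples instead of LangChain message objects; `truncate` and `summarize` are hypothetical helper names.

```python
def truncate(text: str, limit: int = 150) -> str:
    """Cut text to `limit` characters and mark the cut with an ellipsis."""
    return text[:limit] + "..." if len(text) > limit else text


def summarize(messages: list[tuple[str, str]]) -> str:
    """Summarize (role, text) pairs using the first and last customer messages."""
    if not messages:
        return "No messages in conversation."
    customer_texts = [text for role, text in messages if role == "customer"]
    parts = []
    if customer_texts:
        parts.append(f"Initial inquiry: {truncate(customer_texts[0])}")
        # Only add the latest message when it differs from the opening one
        if customer_texts[-1] != customer_texts[0]:
            parts.append(f"Most recent message: {truncate(customer_texts[-1])}")
    parts.append(f"Total exchanges: {len(messages)} messages")
    return "\n".join(parts)


chat = [
    ("customer", "Where is my order?"),
    ("caspar", "Let me check."),
    ("customer", "It has been two weeks!"),
]
print(summarize(chat))
```

This is an extractive summary, so it costs no LLM call; the trade-off is that anything said between the first and last customer messages is visible only in the full transcript.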

### handoff/notifications.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.5
# File: src/caspar/handoff/notifications.py

"""
Agent Notification System

Notifies available human agents about pending handoffs.
In production, this would integrate with Slack, email, or a dashboard.
"""

from datetime import datetime, timezone
from pydantic import BaseModel

from caspar.config import get_logger
from .queue import HandoffRequest
from .context import ConversationContext

logger = get_logger(__name__)


class AgentNotification(BaseModel):
    """A notification sent to human agents."""
    
    notification_id: str
    request_id: str
    priority: str
    customer_id: str
    brief_reason: str
    estimated_wait: int | None
    sent_at: str
    channel: str  # "dashboard", "slack", "email"


# Simulated agent pool
AVAILABLE_AGENTS = [
    {"id": "AGENT-001", "name": "Sarah Johnson", "status": "available", "skills": ["technical", "billing"]},
    {"id": "AGENT-002", "name": "Mike Chen", "status": "available", "skills": ["returns", "shipping"]},
    {"id": "AGENT-003", "name": "Emily Davis", "status": "busy", "skills": ["vip", "complaints"]},
]


def get_available_agents(required_skills: list[str] | None = None) -> list[dict]:
    """Get list of available agents, optionally filtered by skills."""
    
    available = [a for a in AVAILABLE_AGENTS if a["status"] == "available"]
    
    if required_skills:
        available = [
            a for a in available
            if any(skill in a["skills"] for skill in required_skills)
        ]
    
    return available


def notify_available_agents(
    request: HandoffRequest,
    context: ConversationContext | None = None,
) -> list[AgentNotification]:
    """
    Notify available agents about a new handoff request.
    
    In production, this would:
    - Send Slack messages to a support channel
    - Update a real-time dashboard
    - Send push notifications to mobile apps
    - Trigger phone alerts for urgent requests
    
    Args:
        request: The handoff request
        context: Optional conversation context
        
    Returns:
        List of notifications sent
    """
    notifications = []
    
    # Determine required skills based on triggers
    required_skills = []
    if "vip_customer" in request.triggers:
        required_skills.append("vip")
    if "complaint" in str(request.reason).lower():
        required_skills.append("complaints")
    
    # Get available agents
    agents = get_available_agents(required_skills)
    
    if not agents:
        # Fall back to all available agents
        agents = get_available_agents()
    
    # Create notification content
    brief_reason = request.reason[:100] + "..." if len(request.reason) > 100 else request.reason
    
    for agent in agents:
        notification = AgentNotification(
            notification_id=f"NOTIF-{request.request_id}-{agent['id']}",
            request_id=request.request_id,
            priority=request.priority,
            customer_id=request.customer_id,
            brief_reason=brief_reason,
            estimated_wait=request.estimated_wait,
            sent_at=datetime.now(timezone.utc).isoformat(),
            channel="dashboard",
        )
        
        notifications.append(notification)
        
        # Log the "notification" (in production, this would actually send)
        logger.info(
            "agent_notified",
            agent_id=agent["id"],
            agent_name=agent["name"],
            request_id=request.request_id,
            priority=request.priority
        )
        
        # Simulate different notification channels based on priority
        if request.priority == "urgent":
            _send_urgent_notification(agent, request, brief_reason)
        else:
            _send_standard_notification(agent, request, brief_reason)
    
    return notifications


def _send_urgent_notification(agent: dict, request: HandoffRequest, reason: str):
    """Simulate urgent notification (would trigger alerts)."""
    print(f"\n🚨 URGENT HANDOFF ALERT for {agent['name']}!")
    print(f"   Customer: {request.customer_id}")
    print(f"   Reason: {reason}")
    print("   → Immediate attention required\n")


def _send_standard_notification(agent: dict, request: HandoffRequest, reason: str):
    """Simulate standard notification (would update dashboard)."""
    print(f"\n📋 New handoff request for {agent['name']}")
    print(f"   Priority: {request.priority.upper()}")
    print(f"   Customer: {request.customer_id}")
    print(f"   Reason: {reason}\n")

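The skill filter with fallback in `notify_available_agents` has a consequence worth seeing with the sample pool: a VIP complaint asks for `vip` or `complaints` skills, but the only agent with those skills is busy, so the request falls back to every available agent. A minimal sketch of that selection step, where `select_agents` is an illustrative name and the pool is copied from `AVAILABLE_AGENTS`:

```python
AGENTS = [
    {"id": "AGENT-001", "name": "Sarah Johnson", "status": "available", "skills": ["technical", "billing"]},
    {"id": "AGENT-002", "name": "Mike Chen", "status": "available", "skills": ["returns", "shipping"]},
    {"id": "AGENT-003", "name": "Emily Davis", "status": "busy", "skills": ["vip", "complaints"]},
]


def select_agents(agents: list[dict], required_skills: list[str]) -> list[dict]:
    """Prefer available agents with a matching skill; otherwise any available agent."""
    available = [a for a in agents if a["status"] == "available"]
    matched = [a for a in available if any(s in a["skills"] for s in required_skills)]
    # An empty match list falls back to the full available pool
    return matched or available


print([a["name"] for a in select_agents(AGENTS, ["vip", "complaints"])])
print([a["name"] for a in select_agents(AGENTS, ["billing"])])
```

The fallback favors a fast response over a perfect skill match, which seems a sensible default for a demo; a production system might instead wait briefly for a specialist on VIP requests.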

### handoff/approval.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.5
# File: src/caspar/handoff/approval.py

"""
Human-in-the-Loop response approval.

This module enables human review of AI responses before they're sent,
useful for high-stakes or sensitive situations.
"""

from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum

from caspar.config import get_logger

logger = get_logger(__name__)


class ApprovalStatus(Enum):
    """Status of a pending approval."""
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    EDITED = "edited"


@dataclass
class PendingApproval:
    """A response waiting for human approval."""
    conversation_id: str
    original_response: str
    reason: str  # Why approval is needed
    created_at: datetime
    status: ApprovalStatus = ApprovalStatus.PENDING
    reviewer_id: str | None = None
    edited_response: str | None = None
    reviewed_at: datetime | None = None


def needs_approval(state: dict) -> bool:
    """
    Determine if a response needs human approval before sending.
    
    This is checked BEFORE the response is sent to the customer.
    """
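    # High-value actions need approval (a missing or None amount counts as 0)
    if (state.get("pending_refund_amount") or 0) > 100:
        return True
    
    # Policy exceptions need approval
    if state.get("policy_exception_requested"):
        return True
    
    # Very negative sentiment needs human review
    # (sentiment_score may be None before sentiment analysis has run)
    sentiment = state.get("sentiment_score") or 0.0
    if sentiment < -0.7:
        return True
    
    # New customers with complaints (unknown tenure is treated as an established customer)
    tenure_days = state.get("customer_tenure_days")
    if tenure_days is None:
        tenure_days = 365
    if state.get("intent") == "complaint" and tenure_days < 30:
        return True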
    
    return False


def get_approval_reason(state: dict) -> str:
    """Get a human-readable reason for why approval is needed."""
    reasons = []
    
    # Treat missing or None values the same way needs_approval() does
    amount = state.get("pending_refund_amount") or 0
    if amount > 100:
        reasons.append(f"High-value refund: ${amount}")
    
    if state.get("policy_exception_requested"):
        reasons.append("Policy exception requested")
    
    if (state.get("sentiment_score") or 0.0) < -0.7:
        reasons.append("Customer appears very upset")
    
    tenure_days = state.get("customer_tenure_days")
    if tenure_days is None:
        tenure_days = 365
    if state.get("intent") == "complaint" and tenure_days < 30:
        reasons.append("New customer complaint - retention risk")
    
    return "; ".join(reasons) if reasons else "Manual review requested"

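The approval module defines the statuses a `PendingApproval` can take but not the step that moves it out of `PENDING`. One way that step could look is sketched below. This is an assumption, not part of the CASPAR code: `Status`, `Approval` and `review` are hypothetical, simplified stand-ins for `ApprovalStatus`, `PendingApproval` and a reviewer action.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    EDITED = "edited"


@dataclass
class Approval:
    """Simplified stand-in for PendingApproval."""
    original_response: str
    status: Status = Status.PENDING
    reviewer_id: Optional[str] = None
    edited_response: Optional[str] = None
    reviewed_at: Optional[datetime] = None


def review(approval: Approval, reviewer_id: str, approve: bool,
           edited: Optional[str] = None) -> Optional[str]:
    """Record a reviewer decision and return the text to send (None if rejected)."""
    if approval.status is not Status.PENDING:
        raise ValueError("approval has already been reviewed")
    approval.reviewer_id = reviewer_id
    approval.reviewed_at = datetime.now(timezone.utc)
    if not approve:
        approval.status = Status.REJECTED
        return None
    if edited is not None and edited != approval.original_response:
        approval.status = Status.EDITED
        approval.edited_response = edited
        return edited
    approval.status = Status.APPROVED
    return approval.original_response


pending = Approval(original_response="We will refund $150.")
sent = review(pending, "AGENT-001", approve=True, edited="We will refund $150 within 3 days.")
print(pending.status.value, "->", sent)
```

Rejecting a response returns `None`, so the caller has to decide what the customer sees instead, for example a handoff message.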

### agent/nodes_handoff_update.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.5
# File: src/caspar/agent/nodes_handoff_update.py

"""
Updated nodes for human handoff functionality.

These functions extend the agent with handoff support:
- check_sentiment: Analyze customer emotion and detect escalation needs
- human_handoff: Handle the transition to a human agent
"""

from datetime import datetime, timezone

from langchain_openai import ChatOpenAI
from langchain_core.messages import AIMessage, HumanMessage

from caspar.config import settings, get_logger
from caspar.handoff import (
    check_escalation_triggers,
    get_handoff_queue,
    package_context_for_agent,
    notify_available_agents,
    format_context_for_display,
    check_sensitive_topics,
)
from caspar.tools import get_account_info, create_ticket

logger = get_logger(__name__)


async def check_sentiment(state: dict) -> dict:
    """
    Analyze customer sentiment and check all escalation triggers.
    
    This node runs after intent handlers to determine if:
    1. The customer is frustrated (sentiment analysis)
    2. Sensitive topics are detected
    3. Escalation to a human is needed
    """
    logger.info("check_sentiment_start", conversation_id=state.get("conversation_id"))
    
    messages = state["messages"]
    if not messages:
        return {
            "sentiment_score": 0.0,
            "frustration_level": "low",
            "last_updated": datetime.now(timezone.utc).isoformat()
        }
    
    # Get last few messages for context
    recent_messages = messages[-3:]  # slicing already handles lists shorter than 3
    conversation_text = "\n".join([
        f"{'Customer' if isinstance(m, HumanMessage) else 'Agent'}: {m.content}"
        for m in recent_messages
    ])
    
    llm = ChatOpenAI(
        model=settings.default_model,
        api_key=settings.openai_api_key,
        temperature=0
    )
    
    sentiment_prompt = f"""Analyze the customer's emotional state in this conversation.

Conversation:
{conversation_text}

Provide your analysis in this exact format:
SENTIMENT: [number from -1.0 to 1.0, where -1 is very negative, 0 is neutral, 1 is very positive]
FRUSTRATION: [low, medium, or high]"""

    # Use the async call so this node does not block the event loop
    response = await llm.ainvoke([HumanMessage(content=sentiment_prompt)])
    
    # Parse response
    sentiment_score = 0.0
    frustration_level = "low"
    
    for line in response.content.strip().split("\n"):
        if line.startswith("SENTIMENT:"):
            try:
                sentiment_score = float(line.split(":")[1].strip())
                sentiment_score = max(-1.0, min(1.0, sentiment_score))
            except ValueError:
                pass
        elif line.startswith("FRUSTRATION:"):
            level = line.split(":")[1].strip().lower()
            if level in ["low", "medium", "high"]:
                frustration_level = level
    
    result = {
        "sentiment_score": sentiment_score,
        "frustration_level": frustration_level,
        "last_updated": datetime.now(timezone.utc).isoformat()
    }
    
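    # Check for sensitive topics in the latest customer message
    # (the final message may be CASPAR's own reply, since this node runs after the intent handlers)
    last_message = next(
        (m.content for m in reversed(messages) if isinstance(m, HumanMessage)),
        "",
    )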
    if check_sensitive_topics(last_message):
        result["needs_escalation"] = True
        result["escalation_reason"] = "Sensitive topic detected - requires human handling"
        logger.warning("sensitive_topic_detected", conversation_id=state.get("conversation_id"))
    
    # Check if escalation needed based on sentiment
    elif sentiment_score < settings.sentiment_threshold or frustration_level == "high":
        result["needs_escalation"] = True
        result["escalation_reason"] = f"High frustration detected (sentiment: {sentiment_score}, frustration: {frustration_level})"
        logger.warning("escalation_triggered", conversation_id=state.get("conversation_id"))
    
    logger.info(
        "check_sentiment_complete",
        sentiment_score=sentiment_score,
        frustration_level=frustration_level
    )
    
    return result


async def human_handoff(state: dict) -> dict:
    """
    Handle escalation to a human agent.
    
    This node:
    1. Checks escalation triggers
    2. Creates a handoff request
    3. Packages context for the human agent
    4. Notifies available agents
    5. Informs the customer
    """
    logger.info("human_handoff_start", conversation_id=state.get("conversation_id"))
    
    customer_id = state.get("customer_id") or "UNKNOWN"
    conversation_id = state.get("conversation_id")
    
    # Get customer info for context
    customer_info = None
    if customer_id != "UNKNOWN":
        account_result = get_account_info(customer_id)
        if account_result["found"]:
            customer_info = account_result["account"]
    
    # Check escalation triggers
    customer_tier = customer_info.get("loyalty_tier") if customer_info else None
    escalation_result = check_escalation_triggers(state, customer_tier)
    
    # Create ticket for tracking
    ticket_result = create_ticket(
        customer_id=customer_id,
        category="general",
        subject="Human Agent Requested",
        description=escalation_result.reason,
        priority=escalation_result.priority,
        conversation_id=conversation_id,
    )
    
    # Add to handoff queue
    queue = get_handoff_queue()
    handoff_request = queue.add(
        conversation_id=conversation_id,
        customer_id=customer_id,
        priority=escalation_result.priority,
        triggers=[t.value for t in escalation_result.triggers],
        reason=escalation_result.reason,
        ticket_id=ticket_result["ticket"]["ticket_id"],
    )
    
    # Package context for human agent
    state_with_triggers = {
        **state,
        "escalation_triggers": [t.value for t in escalation_result.triggers],
    }
    context = package_context_for_agent(
        state=state_with_triggers,
        request_id=handoff_request.request_id,
        customer_info=customer_info,
    )
    
    # Notify available agents
    notifications = notify_available_agents(handoff_request, context)
    
    # Log the full context (in production, this would go to the agent dashboard)
    context_display = format_context_for_display(context)
    logger.info("handoff_context_prepared", context_length=len(context_display))
    
    # Build customer-facing message
    position = queue.get_queue_position(handoff_request.request_id)
    wait_time = handoff_request.estimated_wait or 5
    
    handoff_message = _build_handoff_message(
        ticket_id=ticket_result["ticket"]["ticket_id"],
        position=position,
        wait_time=wait_time,
        priority=escalation_result.priority,
    )
    
    logger.info(
        "human_handoff_complete",
        request_id=handoff_request.request_id,
        ticket_id=ticket_result["ticket"]["ticket_id"],
        agents_notified=len(notifications)
    )
    
    return {
        "messages": [AIMessage(content=handoff_message)],
        "needs_escalation": True,
        "escalation_reason": escalation_result.reason,
        "ticket_id": ticket_result["ticket"]["ticket_id"],
        "last_updated": datetime.now(timezone.utc).isoformat()
    }


def _build_handoff_message(
    ticket_id: str,
    position: int,
    wait_time: int,
    priority: str,
) -> str:
    """Build the customer-facing handoff message."""
    
    priority_messages = {
        "urgent": "I've flagged this as urgent, and a team member will be with you very shortly.",
        "high": "I've marked this as high priority. A team member will be with you soon.",
        "medium": "A team member will be with you as soon as possible.",
        "low": "A team member will reach out to help you.",
    }
    
    message_parts = [
        "I understand you'd like to speak with a human agent, and I've arranged that for you.",
        "",
        f"**Your Reference Number: {ticket_id}**",
        "",
        priority_messages.get(priority, priority_messages["medium"]),
        "",
    ]
    
    if position > 0:
        message_parts.append(f"You're currently #{position} in our queue.")
    
    message_parts.extend([
        f"Estimated wait time: approximately {wait_time} minutes.",
        "",
        "While you wait:",
        "• You don't need to stay on this chat - we'll reach out to you",
        "• You can reference your ticket number in any follow-up",
        "• Our team has the full context of our conversation",
        "",
        "Is there anything else I can help you with while you wait?",
    ])
    
    return "\n".join(message_parts)


### scripts/test_handoff.py

In [None]:
# From: Zero to AI Agent, Chapter 20, Section 20.5
# File: scripts/test_handoff.py

"""Test the human handoff system."""

import asyncio
from langchain_core.messages import HumanMessage

from caspar.agent import create_agent, create_initial_state
from caspar.handoff import get_handoff_queue
from caspar.config import setup_logging, get_logger

setup_logging()
logger = get_logger(__name__)


async def test_explicit_handoff():
    """Test explicit request for human agent."""
    print("\n" + "=" * 60)
    print("🧪 Test: Explicit Handoff Request")
    print("=" * 60)
    
    agent = await create_agent()
    state = create_initial_state(
        conversation_id="test-handoff-explicit",
        customer_id="CUST-1000"
    )
    state["messages"] = [HumanMessage(content="I want to talk to a real person please")]
    
    config = {"configurable": {"thread_id": "test-handoff-explicit"}}
    result = await agent.ainvoke(state, config)
    
    print(f"Intent: {result['intent']}")
    print(f"Escalated: {result.get('needs_escalation')}")
    print(f"Ticket: {result.get('ticket_id')}")
    print(f"\nCASPAR Response:\n{result['messages'][-1].content}")
    
    # Check queue
    queue = get_handoff_queue()
    request = queue.get_by_conversation("test-handoff-explicit")
    if request:
        print(f"\n📋 Queue Position: {queue.get_queue_position(request.request_id)}")
        print(f"   Priority: {request.priority}")
        print(f"   Est. Wait: {request.estimated_wait} minutes")


async def test_frustration_escalation():
    """Test escalation triggered by frustration."""
    print("\n" + "=" * 60)
    print("🧪 Test: Frustration-Triggered Escalation")
    print("=" * 60)
    
    agent = await create_agent()
    state = create_initial_state(
        conversation_id="test-handoff-frustration",
        customer_id="CUST-1001"
    )
    
    # Simulate a frustrated customer
    state["messages"] = [
        HumanMessage(content="Where is my order?! I've been waiting for weeks!"),
    ]
    
    config = {"configurable": {"thread_id": "test-handoff-frustration"}}
    result = await agent.ainvoke(state, config)
    
    print(f"Intent: {result['intent']}")
    print(f"Sentiment: {result.get('sentiment_score')}")
    print(f"Frustration: {result.get('frustration_level')}")
    print(f"Escalated: {result.get('needs_escalation')}")
    print(f"\nCASPAR Response:\n{result['messages'][-1].content[:300]}...")


async def test_vip_customer():
    """Test VIP customer gets priority handling."""
    print("\n" + "=" * 60)
    print("🧪 Test: VIP Customer Handling")
    print("=" * 60)
    
    agent = await create_agent()
    
    # CUST-1003 is a gold tier customer in our mock data
    state = create_initial_state(
        conversation_id="test-handoff-vip",
        customer_id="CUST-1003"
    )
    state["messages"] = [
        HumanMessage(content="I have an issue with my recent order and I'm not happy about it."),
    ]
    
    config = {"configurable": {"thread_id": "test-handoff-vip"}}
    result = await agent.ainvoke(state, config)
    
    print(f"Intent: {result['intent']}")
    print(f"Escalated: {result.get('needs_escalation')}")
    
    queue = get_handoff_queue()
    request = queue.get_by_conversation("test-handoff-vip")
    if request:
        print(f"Priority: {request.priority}")
        print(f"Triggers: {request.triggers}")


async def test_sensitive_topic():
    """Test sensitive topic detection."""
    print("\n" + "=" * 60)
    print("🧪 Test: Sensitive Topic Detection")
    print("=" * 60)
    
    agent = await create_agent()
    state = create_initial_state(
        conversation_id="test-handoff-sensitive",
        customer_id="CUST-1000"
    )
    state["messages"] = [
        HumanMessage(content="I think this might be fraud. Someone used my card without permission."),
    ]
    
    config = {"configurable": {"thread_id": "test-handoff-sensitive"}}
    result = await agent.ainvoke(state, config)
    
    print(f"Intent: {result['intent']}")
    print(f"Escalated: {result.get('needs_escalation')}")
    print(f"Reason: {result.get('escalation_reason', 'N/A')}")
    print(f"\nCASPAR Response:\n{result['messages'][-1].content[:300]}...")


async def test_queue_management():
    """Test queue management with multiple requests."""
    print("\n" + "=" * 60)
    print("🧪 Test: Queue Management")
    print("=" * 60)
    
    queue = get_handoff_queue()
    
    # Add several requests with different priorities
    requests = [
        ("conv-1", "CUST-1000", "medium", ["general"], "General inquiry"),
        ("conv-2", "CUST-1001", "urgent", ["explicit_request"], "Customer requested agent"),
        ("conv-3", "CUST-1002", "high", ["vip_customer"], "VIP with issue"),
        ("conv-4", "CUST-1003", "low", ["general"], "Simple question"),
    ]
    
    for conv_id, cust_id, priority, triggers, reason in requests:
        queue.add(conv_id, cust_id, priority, triggers, reason)
    
    print("\n📊 Queue Status:")
    counts = queue.get_pending_count()
    for priority, count in counts.items():
        print(f"   {priority.upper()}: {count}")
    
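    print("\n📋 Queue Order (by priority):")
    entries = []
    for conv_id, *_ in requests:
        req = queue.get_by_conversation(conv_id)
        if req:
            entries.append((queue.get_queue_position(req.request_id), req))
    # Sort by position so the listing reflects queue order, not insertion order
    for pos, req in sorted(entries, key=lambda entry: entry[0]):
        print(f"   #{pos}: {req.conversation_id} ({req.priority})")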


async def main():
    """Run all handoff tests."""
    
    await test_explicit_handoff()
    await test_frustration_escalation()
    await test_vip_customer()
    await test_sensitive_topic()
    await test_queue_management()
    
    print("\n" + "=" * 60)
    print("✅ All handoff tests complete!")
    print("=" * 60)


if __name__ == "__main__":
    asyncio.run(main())
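
The queue management test expects higher-priority requests to sit ahead of lower-priority ones. The sketch below shows one way such positions could be derived, assuming the queue ranks by priority first and arrival order second; the real handoff queue may apply different rules. `PRIORITY_RANK`, `SketchRequest`, and `queue_positions` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

# Assumed ranking: a lower number is served first
PRIORITY_RANK = {"urgent": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class SketchRequest:
    """Hypothetical stand-in for a handoff request."""
    conversation_id: str
    priority: str
    arrival: int  # order in which the request was added


def queue_positions(requests: list[SketchRequest]) -> dict[str, int]:
    """Map each conversation ID to its 1-based queue position."""
    ordered = sorted(requests, key=lambda r: (PRIORITY_RANK[r.priority], r.arrival))
    return {r.conversation_id: pos for pos, r in enumerate(ordered, start=1)}


# Same priorities as the requests added in test_queue_management
requests = [
    SketchRequest("conv-1", "medium", arrival=0),
    SketchRequest("conv-2", "urgent", arrival=1),
    SketchRequest("conv-3", "high", arrival=2),
    SketchRequest("conv-4", "low", arrival=3),
]
positions = queue_positions(requests)
for conv_id, pos in sorted(positions.items(), key=lambda item: item[1]):
    print(f"#{pos}: {conv_id}")
```

Using arrival order as the tie-breaker keeps requests of equal priority in first-come, first-served order.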


---
## Section 20.6: Testing and refinement

This section tests CASPAR at three levels: unit tests for individual components, integration tests for complete conversation flows, and LLM-as-a-judge evaluation of response quality.

**Key files:**
- `tests/conftest.py` - Test fixtures
- `tests/unit/` - Unit tests for individual components
- `tests/integration/` - Integration tests for conversation flows
- `tests/evaluation/` - Quality evaluation tests
- `scripts/run_tests.py` - Test runner script

### tests/conftest.py

In [None]:
"""
Shared test fixtures and utilities.

pytest automatically discovers this file and makes fixtures
available to all tests.
"""

import pytest
import asyncio
from unittest.mock import MagicMock, AsyncMock
from langchain_core.messages import HumanMessage, AIMessage

from caspar.agent import create_initial_state


@pytest.fixture
def sample_state():
    """Create a sample agent state for testing."""
    return create_initial_state(
        conversation_id="test-conv-001",
        customer_id="CUST-1000"
    )


@pytest.fixture
def sample_state_with_messages():
    """Create a state with some conversation history."""
    state = create_initial_state(
        conversation_id="test-conv-002",
        customer_id="CUST-1000"
    )
    state["messages"] = [
        HumanMessage(content="Hi, I have a question about my order"),
        AIMessage(content="Hello! I'd be happy to help with your order. Could you provide your order number?"),
        HumanMessage(content="It's TF-10001"),
    ]
    return state


@pytest.fixture
def mock_llm():
    """Create a mock LLM for testing without API calls."""
    mock = MagicMock()
    mock.invoke = MagicMock(return_value=MagicMock(content="mocked response"))
    return mock


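@pytest.fixture
def event_loop():
    """Create an event loop for async tests.

    Note: newer pytest-asyncio releases deprecate overriding this fixture;
    remove it if your installed version warns about it.
    """
    loop = asyncio.new_event_loop()
    yield loop
    loop.close()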
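
The `mock_llm` fixture stubs only the synchronous `invoke` method. Code that awaits `ainvoke` needs an awaitable stub, which `AsyncMock` provides. The following sketch covers both call styles; `make_mock_llm` and `classify_intent` are hypothetical helpers written for this example, not part of CASPAR.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


def make_mock_llm(reply: str = "mocked response") -> MagicMock:
    """Build a mock LLM that answers both sync and async calls."""
    mock = MagicMock()
    mock.invoke = MagicMock(return_value=MagicMock(content=reply))
    mock.ainvoke = AsyncMock(return_value=MagicMock(content=reply))
    return mock


async def classify_intent(llm, message: str) -> str:
    """Hypothetical code under test: ask the LLM and normalize its reply."""
    response = await llm.ainvoke(message)
    return response.content.strip().lower()


llm = make_mock_llm("FAQ")
intent = asyncio.run(classify_intent(llm, "What is your return policy?"))

# The mock records how it was called, so the test can verify the interaction
llm.ainvoke.assert_awaited_once_with("What is your return policy?")
print(intent)
```

Because no network call is made, a test written this way runs without an API key and gives the same result on every run.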


### tests/unit/test_tools_orders.py

In [None]:
"""Unit tests for the order lookup tool."""

import pytest
from caspar.tools.orders import (
    OrderLookupTool,
    get_order_status,
    OrderInfo,
)


class TestOrderLookupTool:
    """Tests for OrderLookupTool class."""
    
    def setup_method(self):
        """Set up a fresh tool instance for each test."""
        self.tool = OrderLookupTool()
    
    def test_lookup_existing_order(self):
        """Should find an order that exists."""
        order = self.tool.lookup("TF-10001")
        
        assert order is not None
        assert order.order_id == "TF-10001"
        assert order.status in ["processing", "shipped", "delivered", "cancelled", "returned"]
    
    def test_lookup_nonexistent_order(self):
        """Should return None for orders that don't exist."""
        order = self.tool.lookup("TF-99999")
        
        assert order is None
    
    def test_lookup_normalizes_order_id(self):
        """Should handle order IDs without the TF- prefix."""
        order = self.tool.lookup("10001")
        
        assert order is not None
        assert order.order_id == "TF-10001"
    
    def test_lookup_case_insensitive(self):
        """Should handle lowercase order IDs."""
        order = self.tool.lookup("tf-10001")
        
        assert order is not None
        assert order.order_id == "TF-10001"
    
    def test_lookup_with_customer_verification(self):
        """Should verify customer ownership when customer_id provided."""
        # TF-10001 belongs to CUST-1001 in our mock data
        order = self.tool.lookup("TF-10001", customer_id="CUST-1001")
        
        assert order is not None
    
    def test_lookup_wrong_customer_returns_none(self):
        """Should return None if customer doesn't own the order."""
        # TF-10001 belongs to CUST-1001, not CUST-9999
        order = self.tool.lookup("TF-10001", customer_id="CUST-9999")
        
        assert order is None
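
The normalization and case-insensitivity tests imply that order IDs are cleaned up before lookup. As a rough sketch of that behavior, assuming the tool trims whitespace, uppercases the input, and adds a missing `TF-` prefix, a normalizer might look like this. `normalize_order_id` is a hypothetical helper; the actual `OrderLookupTool` may normalize differently.

```python
def normalize_order_id(raw: str) -> str:
    """Hypothetical normalizer: trim, uppercase, and ensure the TF- prefix."""
    cleaned = raw.strip().upper()
    if not cleaned.startswith("TF-"):
        cleaned = f"TF-{cleaned}"
    return cleaned


# Input variants modeled on the test cases: prefixed, bare, lowercase, padded
cases = ["TF-10001", "10001", "tf-10001", "  tf-10001  "]
normalized = [normalize_order_id(case) for case in cases]
print(normalized)
```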


### tests/unit/test_tools_tickets.py

In [None]:
"""Unit tests for the ticket creation tool."""

import pytest
from caspar.tools.tickets import (
    TicketTool,
    create_ticket,
    Ticket,
)


class TestTicketTool:
    """Tests for TicketTool class."""
    
    def setup_method(self):
        """Set up a fresh tool instance for each test."""
        self.tool = TicketTool()
    
    def test_create_ticket_returns_ticket(self):
        """Should create and return a ticket."""
        ticket = self.tool.create(
            customer_id="CUST-1000",
            category="technical",
            subject="Test ticket",
            description="This is a test",
        )
        
        assert ticket is not None
        assert ticket.ticket_id.startswith("TKT-")
        assert ticket.customer_id == "CUST-1000"
        assert ticket.category == "technical"
        assert ticket.status == "open"
    
    def test_create_ticket_with_priority(self):
        """Should respect priority setting."""
        ticket = self.tool.create(
            customer_id="CUST-1000",
            category="billing",
            subject="Urgent issue",
            description="Very urgent",
            priority="urgent",
        )
        
        assert ticket.priority == "urgent"
    
    def test_create_ticket_default_priority(self):
        """Should default to medium priority."""
        ticket = self.tool.create(
            customer_id="CUST-1000",
            category="general",
            subject="General question",
            description="Just asking",
        )
        
        assert ticket.priority == "medium"
    
    def test_get_ticket_by_id(self):
        """Should retrieve ticket by ID."""
        created = self.tool.create(
            customer_id="CUST-1000",
            category="return",
            subject="Return request",
            description="Want to return item",
        )
        
        retrieved = self.tool.get(created.ticket_id)
        
        assert retrieved is not None
        assert retrieved.ticket_id == created.ticket_id
    
    def test_get_nonexistent_ticket(self):
        """Should return None for tickets that don't exist."""
        result = self.tool.get("TKT-NONEXISTENT")
        
        assert result is None
    
    def test_get_customer_tickets(self):
        """Should retrieve all tickets for a customer."""
        # Create multiple tickets
        self.tool.create(
            customer_id="CUST-TEST",
            category="technical",
            subject="Issue 1",
            description="First issue",
        )
        self.tool.create(
            customer_id="CUST-TEST",
            category="billing",
            subject="Issue 2",
            description="Second issue",
        )
        self.tool.create(
            customer_id="CUST-OTHER",
            category="general",
            subject="Other customer",
            description="Different customer",
        )
        
        tickets = self.tool.get_customer_tickets("CUST-TEST")
        
        assert len(tickets) == 2
        assert all(t.customer_id == "CUST-TEST" for t in tickets)


### tests/unit/test_handoff_triggers.py

In [None]:
"""Unit tests for escalation trigger detection."""

import pytest
from caspar.handoff.triggers import (
    check_escalation_triggers,
    check_sensitive_topics,
    EscalationTrigger,
)


class TestCheckEscalationTriggers:
    """Tests for check_escalation_triggers function."""
    
    def test_explicit_request_triggers_escalation(self):
        """Should trigger on explicit handoff request."""
        state = {"intent": "handoff_request"}
        
        result = check_escalation_triggers(state)
        
        assert result.should_escalate is True
        assert EscalationTrigger.EXPLICIT_REQUEST in result.triggers
        assert result.priority == "urgent"
    
    def test_high_frustration_triggers_escalation(self):
        """Should trigger on high frustration."""
        state = {
            "intent": "complaint",
            "sentiment_score": -0.8,
            "frustration_level": "high",
        }
        
        result = check_escalation_triggers(state)
        
        assert result.should_escalate is True
        assert EscalationTrigger.HIGH_FRUSTRATION in result.triggers
    
    def test_vip_customer_with_complaint_triggers(self):
        """Should trigger for VIP customers with complaints."""
        state = {
            "intent": "complaint",
            "sentiment_score": 0.0,
            "frustration_level": "medium",
        }
        
        result = check_escalation_triggers(state, customer_tier="gold")
        
        assert result.should_escalate is True
        assert EscalationTrigger.VIP_CUSTOMER in result.triggers
    
    def test_no_triggers_when_everything_ok(self):
        """Should not trigger when conversation is normal."""
        state = {
            "intent": "faq",
            "sentiment_score": 0.5,
            "frustration_level": "low",
            "turn_count": 2,
        }
        
        result = check_escalation_triggers(state)
        
        assert result.should_escalate is False
        assert len(result.triggers) == 0


class TestCheckSensitiveTopics:
    """Tests for sensitive topic detection."""
    
    def test_detects_legal_keywords(self):
        """Should detect legal-related keywords."""
        assert check_sensitive_topics("I'm going to sue you") is True
        assert check_sensitive_topics("I'll contact my lawyer") is True
        assert check_sensitive_topics("This is legal action") is True
    
    def test_detects_fraud_keywords(self):
        """Should detect fraud-related keywords."""
        assert check_sensitive_topics("This is fraud!") is True
        assert check_sensitive_topics("Someone scammed me") is True
        assert check_sensitive_topics("My card was stolen") is True
    
    def test_detects_safety_keywords(self):
        """Should detect safety-related keywords."""
        assert check_sensitive_topics("This product is dangerous") is True
        assert check_sensitive_topics("I was injured") is True
    
    def test_ignores_normal_messages(self):
        """Should not trigger on normal messages."""
        assert check_sensitive_topics("Where is my order?") is False
        assert check_sensitive_topics("I want to return this") is False
        assert check_sensitive_topics("What's your return policy?") is False
    
    def test_case_insensitive(self):
        """Should detect keywords regardless of case."""
        assert check_sensitive_topics("FRAUD") is True
        assert check_sensitive_topics("Lawyer") is True
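
Keyword detection like this can misfire when a keyword appears inside a longer word: a plain substring check for "sue" also matches "issue". The sketch below shows one way to avoid that with word-boundary matching. It is an illustration only; `SENSITIVE_KEYWORDS` and `sketch_sensitive_check` are hypothetical, the keyword list is assumed from the phrases the tests exercise, and the real `check_sensitive_topics` may be implemented differently.

```python
import re

# Assumed keyword list, taken from the phrases the tests exercise
SENSITIVE_KEYWORDS = [
    "sue", "lawyer", "legal action",
    "fraud", "scammed", "stolen",
    "dangerous", "injured",
]

# \b anchors each keyword at word boundaries; IGNORECASE handles "FRAUD"
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in SENSITIVE_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def sketch_sensitive_check(message: str) -> bool:
    """Return True if the message contains a sensitive keyword as a whole word."""
    return _PATTERN.search(message) is not None


print(sketch_sensitive_check("I'm going to sue you"))
print(sketch_sensitive_check("I have an issue with my order"))
```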


### tests/integration/test_intent_classification.py

In [None]:
"""Integration tests for intent classification."""

import pytest
from langchain_core.messages import HumanMessage

from caspar.agent import create_agent, create_initial_state


@pytest.mark.asyncio
async def test_faq_intent_classification():
    """Should classify FAQ questions correctly."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-faq", customer_id="CUST-1000")
    
    test_cases = [
        "What is your return policy?",
        "How long does shipping take?",
        "Do you offer warranties?",
        "What payment methods do you accept?",
    ]
    
    for message in test_cases:
        state["messages"] = [HumanMessage(content=message)]
        config = {"configurable": {"thread_id": f"test-faq-{hash(message)}"}}
        
        result = await agent.ainvoke(state, config)
        
        assert result["intent"] == "faq", f"Expected 'faq' for: {message}, got: {result['intent']}"


@pytest.mark.asyncio
async def test_order_inquiry_intent_classification():
    """Should classify order inquiries correctly."""
    agent = await create_agent()
    # CUST-1000 owns TF-10000, TF-10005, TF-10010, TF-10015
    state = create_initial_state(conversation_id="test-order", customer_id="CUST-1000")
    
    test_cases = [
        "Where is my order TF-10000?",
        "I want to track my order",
        "What's the status of order 10005?",
        "When will my package arrive?",
    ]
    
    for message in test_cases:
        state["messages"] = [HumanMessage(content=message)]
        config = {"configurable": {"thread_id": f"test-order-{hash(message)}"}}
        
        result = await agent.ainvoke(state, config)
        
        assert result["intent"] == "order_inquiry", f"Expected 'order_inquiry' for: {message}, got: {result['intent']}"


@pytest.mark.asyncio
async def test_complaint_intent_classification():
    """Should classify complaints correctly."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-complaint", customer_id="CUST-1000")
    
    test_cases = [
        "This product is terrible!",
        "I'm very disappointed with my purchase",
        "Your service is awful",
        "My item arrived damaged and I'm furious",
    ]
    
    for message in test_cases:
        state["messages"] = [HumanMessage(content=message)]
        config = {"configurable": {"thread_id": f"test-complaint-{hash(message)}"}}
        
        result = await agent.ainvoke(state, config)
        
        assert result["intent"] == "complaint", f"Expected 'complaint' for: {message}, got: {result['intent']}"


@pytest.mark.asyncio
async def test_handoff_request_classification():
    """Should classify handoff requests correctly."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-handoff", customer_id="CUST-1000")
    
    test_cases = [
        "I want to speak to a human",
        "Let me talk to a real person",
        "Connect me with an agent",
        "I need human support please",
    ]
    
    for message in test_cases:
        state["messages"] = [HumanMessage(content=message)]
        config = {"configurable": {"thread_id": f"test-handoff-{hash(message)}"}}
        
        result = await agent.ainvoke(state, config)
        
        assert result["intent"] == "handoff_request", f"Expected 'handoff_request' for: {message}, got: {result['intent']}"
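
The loops in these tests build thread IDs with `hash(message)`. Python salts `str` hashes per process by default, so those IDs change from one test run to the next, which makes a failing thread harder to find in logs or in the checkpoint store. If reproducible IDs are preferred, a content digest is one option. `stable_thread_id` below is a hypothetical helper, not part of CASPAR.

```python
import hashlib


def stable_thread_id(prefix: str, message: str) -> str:
    """Build a thread ID that is identical across runs for the same message."""
    digest = hashlib.sha256(message.encode("utf-8")).hexdigest()[:12]
    return f"{prefix}-{digest}"


first = stable_thread_id("test-faq", "What is your return policy?")
second = stable_thread_id("test-faq", "What is your return policy?")
other = stable_thread_id("test-faq", "How long does shipping take?")
print(first)
```

One trade-off to keep in mind: with a persistent checkpointer, a stable thread ID means a rerun may resume the previous run's conversation state, so tests that rely on a fresh thread would need cleanup or a per-run suffix.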


### tests/integration/test_conversation_flows.py

In [None]:
"""Integration tests for complete conversation flows."""

import pytest
from langchain_core.messages import HumanMessage

from caspar.agent import create_agent, create_initial_state


@pytest.mark.asyncio
async def test_faq_flow_returns_relevant_info():
    """FAQ flow should return relevant policy information."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-faq-flow", customer_id="CUST-1000")
    state["messages"] = [HumanMessage(content="What is your return policy?")]
    
    config = {"configurable": {"thread_id": "test-faq-flow"}}
    result = await agent.ainvoke(state, config)
    
    response = result["messages"][-1].content.lower()
    
    # Should mention key return policy details
    assert any(word in response for word in ["return", "30", "day", "refund"]), \
        f"Response should mention return policy details: {response}"


@pytest.mark.asyncio
async def test_order_inquiry_with_valid_order():
    """Order inquiry should return order details for valid orders."""
    agent = await create_agent()
    # Use CUST-1000 with TF-10000 (TF-10000 belongs to CUST-1000)
    state = create_initial_state(conversation_id="test-order-flow", customer_id="CUST-1000")
    # Use polite phrasing to reduce chance of sentiment escalation
    state["messages"] = [HumanMessage(content="Hi! Could you please check the status of order TF-10000? Thanks!")]
    
    config = {"configurable": {"thread_id": "test-order-flow"}}
    result = await agent.ainvoke(state, config)
    
    # The order lookup should have succeeded - check the state
    # Note: Even if sentiment triggers escalation, order_info should be populated
    order_info = result.get("order_info")
    
    # order_info should exist and have status (not error) when order is found
    assert order_info is not None, \
        f"Order info should be in state. Got: {order_info}"
    assert "status" in order_info, \
        f"Order should be found (has status). Got: {order_info}"
    assert "error" not in order_info, \
        f"Order lookup should not have error. Got: {order_info}"


@pytest.mark.asyncio
async def test_order_inquiry_with_invalid_order():
    """Order inquiry should handle invalid orders gracefully."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-invalid-order", customer_id="CUST-1000")
    state["messages"] = [HumanMessage(content="Where is my order TF-99999?")]
    
    config = {"configurable": {"thread_id": "test-invalid-order"}}
    result = await agent.ainvoke(state, config)
    
    response = result["messages"][-1].content.lower()
    
    # Should either indicate order not found OR escalate to human
    # (escalation is acceptable when we can't find the order)
    order_not_found = any(phrase in response for phrase in ["not found", "couldn't find", "unable to locate", "check", "verify"])
    escalated = result.get("needs_escalation", False) or "human" in response or "agent" in response
    
    assert order_not_found or escalated, \
        f"Response should indicate order not found or escalate: {response}"


@pytest.mark.asyncio
async def test_complaint_creates_ticket():
    """Complaints should create a support ticket."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-complaint-ticket", customer_id="CUST-1000")
    state["messages"] = [HumanMessage(content="My laptop arrived completely broken! This is unacceptable!")]
    
    config = {"configurable": {"thread_id": "test-complaint-ticket"}}
    result = await agent.ainvoke(state, config)
    
    # Should have created a ticket
    assert result.get("ticket_id") is not None, "Complaint should create a ticket"
    assert result["ticket_id"].startswith("TKT-"), f"Invalid ticket ID: {result.get('ticket_id')}"


@pytest.mark.asyncio
async def test_handoff_request_triggers_escalation():
    """Explicit handoff requests should trigger escalation."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-explicit-handoff", customer_id="CUST-1000")
    state["messages"] = [HumanMessage(content="I want to speak to a human agent please")]
    
    config = {"configurable": {"thread_id": "test-explicit-handoff"}}
    result = await agent.ainvoke(state, config)
    
    # Should be escalated
    assert result.get("needs_escalation") is True, "Handoff request should trigger escalation"
    assert result.get("ticket_id") is not None, "Handoff should create a ticket"
    
    # Response should acknowledge the handoff
    response = result["messages"][-1].content.lower()
    assert any(word in response for word in ["human", "agent", "team", "reach"]), \
        f"Response should mention human handoff: {response}"


### tests/integration/test_edge_cases.py

In [None]:
"""Integration tests for edge cases and unusual inputs."""

import pytest
from langchain_core.messages import HumanMessage

from caspar.agent import create_agent, create_initial_state


@pytest.mark.asyncio
async def test_empty_message():
    """Should handle empty messages gracefully."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-empty", customer_id="CUST-1000")
    state["messages"] = [HumanMessage(content="")]
    
    config = {"configurable": {"thread_id": "test-empty"}}
    
    # Should not crash
    result = await agent.ainvoke(state, config)
    assert result is not None
    assert len(result["messages"]) > 0


@pytest.mark.asyncio
async def test_very_long_message():
    """Should handle very long messages."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-long", customer_id="CUST-1000")
    
    # Create a long message
    long_message = "I have a question about my order. " * 100
    state["messages"] = [HumanMessage(content=long_message)]
    
    config = {"configurable": {"thread_id": "test-long"}}
    
    result = await agent.ainvoke(state, config)
    assert result is not None


@pytest.mark.asyncio
async def test_special_characters():
    """Should handle special characters in messages."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-special", customer_id="CUST-1000")
    state["messages"] = [HumanMessage(content="What's the status of order #TF-10000? 🤔 <test> & more")]
    
    config = {"configurable": {"thread_id": "test-special"}}
    
    result = await agent.ainvoke(state, config)
    assert result is not None


@pytest.mark.asyncio
async def test_multiple_questions_in_one_message():
    """Should handle multiple questions in a single message."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-multi-q", customer_id="CUST-1000")
    state["messages"] = [HumanMessage(
        content="Hi! Quick questions: What's your return policy? And do you offer warranties? Thanks!"
    )]
    
    config = {"configurable": {"thread_id": "test-multi-q"}}
    
    result = await agent.ainvoke(state, config)
    response = result["messages"][-1].content.lower()
    
    # Should address at least some of the questions OR escalate for complex request
    addressed_topics = any(word in response for word in ["return", "warranty", "policy", "day"])
    escalated = result.get("needs_escalation", False) or "human" in response or "agent" in response
    
    assert addressed_topics or escalated, \
        f"Should address topics or escalate complex request: {response[:200]}"


@pytest.mark.asyncio
async def test_all_caps_message():
    """Should handle ALL CAPS messages (often indicate frustration)."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-caps", customer_id="CUST-1000")
    state["messages"] = [HumanMessage(content="WHERE IS MY ORDER THIS IS TAKING TOO LONG")]
    
    config = {"configurable": {"thread_id": "test-caps"}}
    
    result = await agent.ainvoke(state, config)
    
    # Should recognize as order inquiry or complaint
    assert result["intent"] in ["order_inquiry", "complaint"], \
        f"ALL CAPS should be recognized as order/complaint: {result['intent']}"


@pytest.mark.asyncio
async def test_greeting_only():
    """Should handle simple greetings."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-greeting", customer_id="CUST-1000")
    state["messages"] = [HumanMessage(content="Hello!")]
    
    config = {"configurable": {"thread_id": "test-greeting"}}
    
    result = await agent.ainvoke(state, config)
    response = result["messages"][-1].content.lower()
    
    # Should respond with a greeting
    assert any(word in response for word in ["hello", "hi", "help", "welcome"]), \
        f"Should greet the customer: {response}"


@pytest.mark.asyncio
async def test_typos_and_misspellings():
    """Should handle messages with typos."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="test-typos", customer_id="CUST-1000")
    state["messages"] = [HumanMessage(content="waht is ur retrun polcy?")]
    
    config = {"configurable": {"thread_id": "test-typos"}}
    
    result = await agent.ainvoke(state, config)
    
    # Should still classify as FAQ about returns
    assert result["intent"] == "faq", f"Should understand despite typos: {result['intent']}"


### tests/evaluation/evaluator.py

In [None]:
"""
Response quality evaluation framework.

Uses LLM-as-a-judge to assess agent responses.
"""

from pydantic import BaseModel
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

from caspar.config import settings


class EvaluationResult(BaseModel):
    """Result of evaluating a response."""
    
    relevance_score: float  # 0-1: How relevant is the response?
    accuracy_score: float   # 0-1: Is the information correct?
    helpfulness_score: float  # 0-1: Does it help the customer?
    tone_score: float       # 0-1: Is the tone appropriate?
    overall_score: float    # 0-1: Overall quality
    feedback: str           # Explanation of scores


class ResponseEvaluator:
    """Evaluates agent responses using LLM-as-a-judge."""
    
    def __init__(self):
        self.llm = ChatOpenAI(
            model=settings.smart_model,  # Use better model for evaluation
            api_key=settings.openai_api_key,
            temperature=0,
        )
    
    def evaluate(
        self,
        customer_message: str,
        agent_response: str,
        expected_topics: list[str] | None = None,
        context: str | None = None,
    ) -> EvaluationResult:
        """
        Evaluate an agent response.
        
        Args:
            customer_message: What the customer asked
            agent_response: What the agent responded
            expected_topics: Topics that should be covered
            context: Additional context (e.g., order info, policies)
        """
        eval_prompt = f"""You are evaluating a customer service AI agent's response.

Customer Message: "{customer_message}"

Agent Response: "{agent_response}"

{f'Expected Topics: {", ".join(expected_topics)}' if expected_topics else ''}
{f'Context: {context}' if context else ''}

Evaluate the response on these criteria (0.0 to 1.0):

1. RELEVANCE: Does the response address what the customer asked?
2. ACCURACY: Is the information provided correct and not hallucinated?
3. HELPFULNESS: Does the response help solve the customer's problem?
4. TONE: Is the tone professional, friendly, and appropriate?

Respond in this exact format:
RELEVANCE: [score]
ACCURACY: [score]
HELPFULNESS: [score]
TONE: [score]
FEEDBACK: [1-2 sentence explanation]"""

        response = self.llm.invoke([HumanMessage(content=eval_prompt)])
        
        # Parse response
        scores = {"relevance": 0.5, "accuracy": 0.5, "helpfulness": 0.5, "tone": 0.5}
        feedback = "Unable to parse evaluation"
        
        for line in response.content.strip().split("\n"):
            line = line.strip()
            if line.startswith("RELEVANCE:"):
                scores["relevance"] = self._parse_score(line)
            elif line.startswith("ACCURACY:"):
                scores["accuracy"] = self._parse_score(line)
            elif line.startswith("HELPFULNESS:"):
                scores["helpfulness"] = self._parse_score(line)
            elif line.startswith("TONE:"):
                scores["tone"] = self._parse_score(line)
            elif line.startswith("FEEDBACK:"):
                feedback = line.split(":", 1)[1].strip()
        
        overall = sum(scores.values()) / len(scores)
        
        return EvaluationResult(
            relevance_score=scores["relevance"],
            accuracy_score=scores["accuracy"],
            helpfulness_score=scores["helpfulness"],
            tone_score=scores["tone"],
            overall_score=overall,
            feedback=feedback,
        )
    
    def _parse_score(self, line: str) -> float:
        """Parse a score from an evaluation line."""
        try:
            score_str = line.split(":")[1].strip()
            score = float(score_str)
            return max(0.0, min(1.0, score))
        except (IndexError, ValueError):
            return 0.5
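
The line-based parsing above can be tried out without calling a model. The sketch below is a standalone variant, not the project's code: `parse_judge_output` and `SCORE_KEYS` are hypothetical names, and it assumes the judge may decorate a score (for example `0.8/1.0`), so it extracts the first number on the line instead of converting the whole value.

```python
import re

# Hypothetical names for illustration; the score labels mirror the prompt format above
SCORE_KEYS = ("RELEVANCE", "ACCURACY", "HELPFULNESS", "TONE")


def parse_judge_output(text: str, default: float = 0.5) -> dict:
    """Parse 'KEY: value' lines into clamped scores plus feedback."""
    parsed = {key.lower(): default for key in SCORE_KEYS}
    parsed["feedback"] = "Unable to parse evaluation"

    for raw_line in text.strip().splitlines():
        key, sep, value = raw_line.strip().partition(":")
        if not sep:
            continue  # Not a 'KEY: value' line
        key = key.strip().upper()
        value = value.strip()
        if key in SCORE_KEYS:
            # Take the first number on the line so "0.8/1.0" still parses
            match = re.search(r"-?\d+(?:\.\d+)?", value)
            if match:
                score = float(match.group())
                parsed[key.lower()] = max(0.0, min(1.0, score))  # Clamp to 0-1
        elif key == "FEEDBACK":
            parsed["feedback"] = value

    scores = [parsed[key.lower()] for key in SCORE_KEYS]
    parsed["overall"] = sum(scores) / len(scores)
    return parsed


sample = """RELEVANCE: 0.9
ACCURACY: 0.8/1.0
HELPFULNESS: not sure
TONE: 1.5
FEEDBACK: Clear answer: cites the 30-day window."""

result = parse_judge_output(sample)
print(result)
```

An unparseable score falls back to the default, and an out-of-range score is clamped, which matches the behavior of `_parse_score`. Whether a silent fallback is desirable is a design choice: a stricter evaluator could raise instead, so that a malformed judge reply fails the test visibly.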


### tests/evaluation/test_response_quality.py

In [None]:
"""Evaluation tests for response quality."""

import pytest
from langchain_core.messages import HumanMessage

from caspar.agent import create_agent, create_initial_state
from .evaluator import ResponseEvaluator


@pytest.fixture
def evaluator():
    """Create a response evaluator."""
    return ResponseEvaluator()


@pytest.mark.asyncio
async def test_faq_response_quality(evaluator):
    """FAQ responses should be high quality."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="eval-faq", customer_id="CUST-1000")
    state["messages"] = [HumanMessage(content="What is your return policy?")]
    
    config = {"configurable": {"thread_id": "eval-faq"}}
    result = await agent.ainvoke(state, config)
    
    response = result["messages"][-1].content
    
    evaluation = evaluator.evaluate(
        customer_message="What is your return policy?",
        agent_response=response,
        expected_topics=["return", "30 days", "refund"],
        context="TechFlow has a 30-day return policy for most items.",
    )
    
    assert evaluation.relevance_score >= 0.7, f"Low relevance: {evaluation.feedback}"
    assert evaluation.overall_score >= 0.7, f"Low overall score: {evaluation.feedback}"


@pytest.mark.asyncio
async def test_complaint_response_quality(evaluator):
    """Complaint responses should be empathetic and helpful."""
    agent = await create_agent()
    state = create_initial_state(conversation_id="eval-complaint", customer_id="CUST-1000")
    state["messages"] = [HumanMessage(content="My laptop arrived damaged! I'm very upset!")]
    
    config = {"configurable": {"thread_id": "eval-complaint"}}
    result = await agent.ainvoke(state, config)
    
    response = result["messages"][-1].content
    
    evaluation = evaluator.evaluate(
        customer_message="My laptop arrived damaged! I'm very upset!",
        agent_response=response,
        expected_topics=["apologize", "understand", "help", "ticket"],
    )
    
    assert evaluation.tone_score >= 0.7, f"Tone not empathetic enough: {evaluation.feedback}"
    assert evaluation.helpfulness_score >= 0.6, f"Not helpful enough: {evaluation.feedback}"


@pytest.mark.asyncio
async def test_order_status_response_accuracy(evaluator):
    """Order status responses should be accurate."""
    agent = await create_agent()
    # Use CUST-1000 with TF-10000 (matching ownership)
    state = create_initial_state(conversation_id="eval-order", customer_id="CUST-1000")
    state["messages"] = [HumanMessage(content="Where is my order TF-10000?")]
    
    config = {"configurable": {"thread_id": "eval-order"}}
    result = await agent.ainvoke(state, config)
    
    response = result["messages"][-1].content
    
    evaluation = evaluator.evaluate(
        customer_message="Where is my order TF-10000?",
        agent_response=response,
        expected_topics=["order", "status", "TF-10000"],
    )
    
    assert evaluation.accuracy_score >= 0.7, f"Inaccurate response: {evaluation.feedback}"


### tests/evaluation/test_dataset.py

In [None]:
"""
Test dataset for systematic evaluation.

This dataset covers various scenarios the agent should handle.
"""

TEST_CASES = [
    # FAQ Questions
    {
        "category": "faq",
        "input": "What is your return policy?",
        "expected_intent": "faq",
        "expected_topics": ["return", "30 days"],
        "min_quality_score": 0.7,
    },
    {
        "category": "faq",
        "input": "How long does shipping take?",
        "expected_intent": "faq",
        "expected_topics": ["shipping", "days", "delivery"],
        "min_quality_score": 0.7,
    },
    {
        "category": "faq",
        "input": "Do you offer warranties on laptops?",
        "expected_intent": "faq",
        "expected_topics": ["warranty", "year"],
        "min_quality_score": 0.7,
    },
    
    # Order Inquiries
    {
        "category": "order",
        "input": "Where is my order TF-10000?",
        "expected_intent": "order_inquiry",
        "expected_topics": ["order", "status"],
        "min_quality_score": 0.7,
    },
    {
        "category": "order",
        "input": "I want to track my package",
        "expected_intent": "order_inquiry",
        "expected_topics": ["track", "order"],
        "min_quality_score": 0.6,
    },
    
    # Complaints
    {
        "category": "complaint",
        "input": "This product is defective and I want a refund!",
        "expected_intent": "complaint",
        "expected_topics": ["sorry", "help", "refund"],
        "min_quality_score": 0.7,
        "expect_ticket": True,
    },
    {
        "category": "complaint",
        "input": "I've been waiting 3 weeks for my order. This is ridiculous!",
        "expected_intent": "complaint",
        "expected_topics": ["apologize", "order", "help"],
        "min_quality_score": 0.7,
    },
    
    # Handoff Requests
    {
        "category": "handoff",
        "input": "I want to speak to a human agent",
        "expected_intent": "handoff_request",
        "expected_topics": ["human", "agent", "help"],
        "min_quality_score": 0.7,
        "expect_escalation": True,
    },
    
    # Edge Cases
    {
        "category": "edge",
        "input": "Hi",
        "expected_intent": "general",
        "expected_topics": ["hello", "help"],
        "min_quality_score": 0.6,
    },
    {
        "category": "edge",
        "input": "",
        "expected_intent": "general",
        "min_quality_score": 0.0,  # Empty input, just shouldn't crash
    },
]
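
The dataset only declares expectations; something still has to compare them with what the agent returned. One way that comparison might look is sketched below. `check_case` is a hypothetical helper, the `result` dict stands in for the agent's final state (using the `intent`, `ticket_id`, and `needs_escalation` keys seen elsewhere in this chapter), and the quality score is assumed to come from the evaluator.

```python
from typing import Any


def check_case(case: dict[str, Any], result: dict[str, Any], quality_score: float) -> list[str]:
    """Return human-readable failures for one test case (empty list means pass)."""
    failures = []

    if result.get("intent") != case["expected_intent"]:
        failures.append(
            f"intent: expected {case['expected_intent']!r}, got {result.get('intent')!r}"
        )

    if quality_score < case["min_quality_score"]:
        failures.append(f"quality: {quality_score:.2f} < {case['min_quality_score']:.2f}")

    # Optional flags: only enforced when the case sets them, so use .get()
    if case.get("expect_ticket") and not result.get("ticket_id"):
        failures.append("expected a ticket to be created")
    if case.get("expect_escalation") and not result.get("needs_escalation"):
        failures.append("expected escalation")

    return failures


# One case copied from the dataset, checked against a stubbed agent result
complaint_case = {
    "category": "complaint",
    "input": "This product is defective and I want a refund!",
    "expected_intent": "complaint",
    "expected_topics": ["sorry", "help", "refund"],
    "min_quality_score": 0.7,
    "expect_ticket": True,
}
stub_result = {"intent": "complaint", "ticket_id": None, "needs_escalation": False}

failures = check_case(complaint_case, stub_result, quality_score=0.65)
print(failures)
```

Reading optional keys with `.get()` matters here because not every case defines `expected_topics`, `expect_ticket`, or `expect_escalation`.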


### scripts/run_tests.py

In [None]:
"""
Unified test runner for CASPAR.

Run different test suites based on the situation.
"""

import subprocess
import sys
import argparse


def run_unit_tests():
    """Run fast unit tests."""
    print("\n🧪 Running Unit Tests...")
    result = subprocess.run(
        ["pytest", "tests/unit/", "-v", "--tb=short"],
        capture_output=False
    )
    return result.returncode == 0


def run_integration_tests():
    """Run integration tests (requires API key)."""
    print("\n🔗 Running Integration Tests...")
    result = subprocess.run(
        ["pytest", "tests/integration/", "-v", "--tb=short", "--timeout=60"],
        capture_output=False
    )
    return result.returncode == 0


def run_evaluation_tests():
    """Run evaluation tests (slowest, most thorough)."""
    print("\n📊 Running Evaluation Tests...")
    result = subprocess.run(
        ["pytest", "tests/evaluation/", "-v", "--tb=short", "--timeout=120"],
        capture_output=False
    )
    return result.returncode == 0


def run_all_tests():
    """Run all test suites."""
    results = {
        "unit": run_unit_tests(),
        "integration": run_integration_tests(),
        "evaluation": run_evaluation_tests(),
    }
    
    print("\n" + "=" * 60)
    print("TEST SUMMARY")
    print("=" * 60)
    
    for suite, passed in results.items():
        status = "✅ PASSED" if passed else "❌ FAILED"
        print(f"  {suite.capitalize()}: {status}")
    
    all_passed = all(results.values())
    print(f"\nOverall: {'✅ ALL PASSED' if all_passed else '❌ SOME FAILED'}")
    
    return all_passed


def main():
    parser = argparse.ArgumentParser(description="Run CASPAR tests")
    parser.add_argument(
        "--suite",
        choices=["unit", "integration", "evaluation", "all"],
        default="unit",
        help="Which test suite to run (default: unit)"
    )
    parser.add_argument(
        "--quick",
        action="store_true",
        help="Run only unit tests (fastest)"
    )
    
    args = parser.parse_args()
    
    if args.quick:
        success = run_unit_tests()
    elif args.suite == "unit":
        success = run_unit_tests()
    elif args.suite == "integration":
        success = run_integration_tests()
    elif args.suite == "evaluation":
        success = run_evaluation_tests()
    else:
        success = run_all_tests()
    
    sys.exit(0 if success else 1)


if __name__ == "__main__":
    main()
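
The three `run_*` functions differ only in the test path and the timeout, so the command could also be built from a small table. The sketch below is one possible refactoring, not the project's code: `SUITES` and `build_pytest_command` are hypothetical names. It also launches pytest through `sys.executable -m pytest`, on the assumption that you want the pytest installed in the active virtual environment rather than whichever `pytest` is first on `PATH`.

```python
import sys

# Hypothetical table: suite name -> (test paths, timeout in seconds or None)
SUITES = {
    "unit": (["tests/unit/"], None),
    "integration": (["tests/integration/"], 60),
    "evaluation": (["tests/evaluation/"], 120),
}


def build_pytest_command(suite: str) -> list[str]:
    """Build the argument list to pass to subprocess.run for one suite."""
    paths, timeout = SUITES[suite]
    command = [sys.executable, "-m", "pytest", *paths, "-v", "--tb=short"]
    if timeout is not None:
        # --timeout is provided by the pytest-timeout plugin
        command.append(f"--timeout={timeout}")
    return command


print(build_pytest_command("integration"))
```

Each runner would then reduce to `subprocess.run(build_pytest_command(suite)).returncode == 0`.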


---
## Section 20.7: Deployment and next steps

This section covers deploying CASPAR with Docker and FastAPI.

**Key files:**
- `Dockerfile` - Container build configuration
- `src/caspar/api/main.py` - FastAPI application
- `src/caspar/api/metrics.py` - Monitoring metrics
- `src/caspar/config/logging_config.py` - Production logging

### Dockerfile

In [None]:
# Save as: Dockerfile

# Use Python 3.13 slim image
FROM python:3.13-slim

# Set working directory
WORKDIR /app

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PORT=8000

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first (for caching)
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy package configuration and source code
COPY pyproject.toml .
COPY src/ ./src/
COPY data/ ./data/
COPY scripts/ ./scripts/

# Install the caspar package
# This makes 'from caspar.agent import ...' work properly
RUN pip install --no-cache-dir -e .

# Create non-root user for security
RUN useradd --create-home appuser && chown -R appuser:appuser /app
USER appuser

# Expose the API port (documentation only, Railway ignores this)
EXPOSE 8000

# Health check for local Docker (Railway uses its own)
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD curl -f http://localhost:${PORT}/health || exit 1

# Run the API server
# IMPORTANT: Use $PORT for Railway compatibility (they set this env var)
CMD uvicorn caspar.api.main:app --host 0.0.0.0 --port ${PORT}


### api/main.py

In [None]:
# Save as: src/caspar/api/main.py

"""
CASPAR API - Production-ready FastAPI application.

Provides REST endpoints for the customer service agent.
"""

from contextlib import asynccontextmanager
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel, Field
import uuid

from caspar.config import settings, get_logger
from caspar.agent import create_checkpointer_context, create_agent, create_initial_state
from caspar.knowledge import get_retriever

logger = get_logger(__name__)

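# Store active conversations in memory
# Note: this dict is per-process and is lost on restart, so clients get a 404
# for older conversation IDs. For horizontal scaling, use a shared store such as Redis.
conversations: dict = {}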

# The agent instance (initialized on startup)
agent = None


# ─────────────────────────────────────────────────────────────
# LIFESPAN - STARTUP AND SHUTDOWN
# ─────────────────────────────────────────────────────────────

@asynccontextmanager
async def lifespan(app: FastAPI):
    """
    Initialize resources on startup, cleanup on shutdown.
    
    The checkpointer context manager MUST wrap the yield to keep
    the database connection open during the server's lifetime.
    """
    global agent
    
    logger.info("starting_caspar_api", version="1.0.0")
    
    # Step 1: Initialize knowledge base (validates it's ready)
    # The return value is not needed here; the call itself does the validation
    get_retriever()
    logger.info("knowledge_base_ready")
    
    # Step 2: Create checkpointer context
    # The 'async with' keeps the database connection open
    async with create_checkpointer_context() as checkpointer:
        
        # Step 3: Create the agent with the checkpointer
        agent = await create_agent(checkpointer=checkpointer)
        logger.info(
            "agent_initialized",
            persistence_enabled=checkpointer is not None
        )
        
        # Step 4: Server runs here (while inside the 'async with')
        yield
        
        # Step 5: Cleanup on shutdown
        logger.info("shutting_down_caspar_api")
        conversations.clear()
    
    # Database connection closes automatically when we exit 'async with'


# ─────────────────────────────────────────────────────────────
# FASTAPI APP SETUP
# ─────────────────────────────────────────────────────────────

app = FastAPI(
    title="CASPAR API",
    description="Customer Service AI Agent powered by LangGraph",
    version="1.0.0",
    lifespan=lifespan,  # Connect our startup/shutdown logic
)

# CORS middleware allows web browsers to call our API
# Without this, browsers block requests from different domains
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # In production, list specific domains (a wildcard is unsafe with credentials)
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)


# ─────────────────────────────────────────────────────────────
# REQUEST MODELS (what clients send to us)
# ─────────────────────────────────────────────────────────────

class StartConversationRequest(BaseModel):
    """Request to start a new conversation."""
    customer_id: str = Field(..., description="Customer identifier")
    initial_message: str | None = Field(None, description="Optional first message")


class SendMessageRequest(BaseModel):
    """Request to send a message in a conversation."""
    message: str = Field(..., min_length=1, max_length=10000)


# ─────────────────────────────────────────────────────────────
# RESPONSE MODELS (what we send back to clients)
# ─────────────────────────────────────────────────────────────

class StartConversationResponse(BaseModel):
    """Response with new conversation details."""
    conversation_id: str
    message: str


class SendMessageResponse(BaseModel):
    """Response from the agent."""
    response: str
    intent: str | None = None
    needs_escalation: bool = False
    ticket_id: str | None = None
    metadata: dict = Field(default_factory=dict)


class ConversationStatus(BaseModel):
    """Current status of a conversation."""
    conversation_id: str
    customer_id: str
    message_count: int
    intent: str | None
    needs_escalation: bool
    created_at: str


class HealthResponse(BaseModel):
    """Health check response."""
    status: str
    version: str
    agent_ready: bool


# ─────────────────────────────────────────────────────────────
# ENDPOINTS
# ─────────────────────────────────────────────────────────────

@app.get("/health", response_model=HealthResponse, tags=["System"])
async def health_check():
    """
    Check if the service is healthy.
    
    Used by:
    - Load balancers to know if this instance can receive traffic
    - Kubernetes/Docker health checks
    - Monitoring systems
    """
    return HealthResponse(
        status="healthy",
        version="1.0.0",
        agent_ready=agent is not None,
    )


@app.post("/conversations", response_model=StartConversationResponse, tags=["Conversations"])
async def start_conversation(request: StartConversationRequest):
    """
    Start a new conversation with CASPAR.
    
    Returns a conversation ID to use for subsequent messages.
    """
    # Generate a unique ID for this conversation
    conversation_id = f"conv-{uuid.uuid4().hex[:12]}"
    
    # Initialize the agent's state for this conversation
    state = create_initial_state(
        conversation_id=conversation_id,
        customer_id=request.customer_id,
    )
    
    # Store in memory (keyed by conversation_id)
    conversations[conversation_id] = {
        "state": state,
        "customer_id": request.customer_id,
    }
    
    logger.info(
        "conversation_started",
        conversation_id=conversation_id,
        customer_id=request.customer_id,
    )
    
    # If the client sent an initial message, process it immediately
    if request.initial_message:
        response = await _process_message(conversation_id, request.initial_message)
        return StartConversationResponse(
            conversation_id=conversation_id,
            message=response.response,
        )
    
    # Otherwise, return a greeting
    return StartConversationResponse(
        conversation_id=conversation_id,
        message="Hello! I'm CASPAR, your customer service assistant. How can I help you today?",
    )


@app.post(
    "/conversations/{conversation_id}/messages",
    response_model=SendMessageResponse,
    tags=["Conversations"],
)
async def send_message(conversation_id: str, request: SendMessageRequest):
    """
    Send a message in an existing conversation.
    
    The agent will process the message and return a response.
    """
    # Check if conversation exists
    if conversation_id not in conversations:
        raise HTTPException(
            status_code=404,
            detail=f"Conversation {conversation_id} not found",
        )
    
    # Process the message (see helper function below)
    return await _process_message(conversation_id, request.message)


@app.get(
    "/conversations/{conversation_id}",
    response_model=ConversationStatus,
    tags=["Conversations"],
)
async def get_conversation(conversation_id: str):
    """Get the current status of a conversation."""
    if conversation_id not in conversations:
        raise HTTPException(
            status_code=404,
            detail=f"Conversation {conversation_id} not found",
        )
    
    conv = conversations[conversation_id]
    state = conv["state"]
    
    return ConversationStatus(
        conversation_id=conversation_id,
        customer_id=conv["customer_id"],
        message_count=len(state.get("messages", [])),
        intent=state.get("intent"),
        needs_escalation=state.get("needs_escalation", False),
        created_at=state.get("started_at", "unknown"),
    )


@app.delete("/conversations/{conversation_id}", tags=["Conversations"])
async def end_conversation(conversation_id: str):
    """End a conversation and clean up resources."""
    if conversation_id not in conversations:
        raise HTTPException(
            status_code=404,
            detail=f"Conversation {conversation_id} not found",
        )
    
    del conversations[conversation_id]
    
    logger.info("conversation_ended", conversation_id=conversation_id)
    
    return {"status": "ended", "conversation_id": conversation_id}


# ─────────────────────────────────────────────────────────────
# METRICS ENDPOINT
# ─────────────────────────────────────────────────────────────

from caspar.api.metrics import metrics

@app.get("/metrics", tags=["System"])
async def get_metrics():
    """
    Get current metrics.
    
    Returns counters, latencies, and uptime.
    Useful for monitoring dashboards.
    """
    return metrics.get_stats()


# ─────────────────────────────────────────────────────────────
# HELPER FUNCTIONS
# ─────────────────────────────────────────────────────────────

async def _process_message(conversation_id: str, message: str) -> SendMessageResponse:
    """
    Process a message through the agent.
    
    This is a helper function used by multiple endpoints.
    The underscore prefix indicates it's private (not an endpoint).
    """
    from langchain_core.messages import HumanMessage
    
    # Get the conversation from memory
    conv = conversations[conversation_id]
    state = conv["state"]
    
    # Add the user's message to the conversation history
    state["messages"].append(HumanMessage(content=message))
    
    # Configure the agent with this conversation's thread_id
    # This enables persistence (if checkpointer is available)
    config = {"configurable": {"thread_id": conversation_id}}
    
    try:
        # ═══════════════════════════════════════════════════════
        # THIS IS THE KEY LINE - Run the LangGraph agent!
        # ═══════════════════════════════════════════════════════
        result = await agent.ainvoke(state, config)
        
        # Update stored state with the result
        conv["state"] = result
        
        # Extract the AI's response (last message)
        ai_response = result["messages"][-1].content if result["messages"] else \
            "I apologize, but I couldn't process your request."
        
        logger.info(
            "message_processed",
            conversation_id=conversation_id,
            intent=result.get("intent"),
            needs_escalation=result.get("needs_escalation", False),
        )
        
        # Build and return the response
        return SendMessageResponse(
            response=ai_response,
            intent=result.get("intent"),
            needs_escalation=result.get("needs_escalation", False),
            ticket_id=result.get("ticket_id"),
            metadata={
                "sentiment_score": result.get("sentiment_score"),
                "frustration_level": result.get("frustration_level"),
            },
        )
        
    except Exception as e:
        logger.error(
            "message_processing_error", 
            error=str(e), 
            conversation_id=conversation_id
        )
        raise HTTPException(
            status_code=500,
            detail="An error occurred processing your message. Please try again.",
        )
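
The `lifespan` docstring says the checkpointer context must wrap the `yield`. The standard-library sketch below illustrates why, using stand-ins only: `FakeConnection`, `fake_checkpointer_context`, and `lifespan_sketch` are hypothetical names, and no FastAPI or database is involved. It records whether the "connection" is open at each stage.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeConnection:
    """Stand-in for a database connection (hypothetical, for illustration)."""

    def __init__(self):
        self.is_open = False


@asynccontextmanager
async def fake_checkpointer_context(conn: FakeConnection):
    """Open the connection on entry and close it on exit."""
    conn.is_open = True
    try:
        yield conn
    finally:
        conn.is_open = False


@asynccontextmanager
async def lifespan_sketch(conn: FakeConnection, events: list):
    """Same shape as the lifespan function: the yield sits inside 'async with'."""
    async with fake_checkpointer_context(conn) as checkpointer:
        events.append(("startup", checkpointer.is_open))
        yield  # The server would handle requests here
        events.append(("shutdown", checkpointer.is_open))
    events.append(("after_exit", conn.is_open))


async def main() -> list:
    conn, events = FakeConnection(), []
    async with lifespan_sketch(conn, events):
        events.append(("serving", conn.is_open))
    return events


events = asyncio.run(main())
print(events)
```

The connection stays open for startup, serving, and shutdown, and closes only after the `async with` block exits. If the `yield` sat outside that block, the connection would already be closed while requests were being served.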


### api/metrics.py

In [None]:
# Save as: src/caspar/api/metrics.py

"""
Simple metrics tracking for CASPAR.

In production, you'd use Prometheus, DataDog, or similar.
This simple implementation demonstrates the concepts.
"""

from datetime import datetime, timezone
from collections import defaultdict
import threading


class SimpleMetrics:
    """
    Thread-safe metrics collector.
    
    Why thread-safe? FastAPI handles multiple requests at once.
    Without locks, concurrent updates could corrupt our data.
    """
    
    def __init__(self):
        # Lock prevents race conditions when multiple requests update metrics
        self._lock = threading.Lock()
        
        # Counters track "how many times did X happen?"
        # defaultdict(int) means missing keys default to 0
        self._counters = defaultdict(int)
        
        # Latencies track "how long did X take?"
        # We store lists of measurements for each operation
        self._latencies = defaultdict(list)
        
        # Track when we started (for uptime calculation)
        self._started_at = datetime.now(timezone.utc)
    
    def increment(self, name: str, value: int = 1):
        """
        Increment a counter.
        
        Usage:
            metrics.increment("conversations_started")
            metrics.increment("messages_processed")
            metrics.increment("errors", 1)
        """
        with self._lock:  # Acquire lock before modifying
            self._counters[name] += value
        # Lock automatically released when we exit 'with' block
    
    def record_latency(self, name: str, seconds: float):
        """
        Record how long an operation took.
        
        Usage:
            start = time.time()
            do_something()
            metrics.record_latency("llm_call", time.time() - start)
        """
        with self._lock:
            self._latencies[name].append(seconds)
            
            # Keep only last 1000 measurements to prevent memory bloat
            # Older measurements "fall off" as new ones come in
            if len(self._latencies[name]) > 1000:
                self._latencies[name] = self._latencies[name][-1000:]
    
    def get_stats(self) -> dict:
        """
        Get current statistics.
        
        Returns a dictionary that can be serialized to JSON.
        """
        with self._lock:
            stats = {
                # How long has the server been running?
                "uptime_seconds": (
                    datetime.now(timezone.utc) - self._started_at
                ).total_seconds(),
                
                # All counter values
                "counters": dict(self._counters),
                
                # Latency statistics (calculated below)
                "latencies": {},
            }
            
            # Calculate latency statistics for each tracked operation
            for name, values in self._latencies.items():
                if values:
                    stats["latencies"][name] = {
                        "count": len(values),
                        "avg_ms": sum(values) * 1000 / len(values),
                        "max_ms": max(values) * 1000,
                        "min_ms": min(values) * 1000,
                    }
            
            return stats


# Create a single global instance
# All parts of the app use this same instance
metrics = SimpleMetrics()


def track_latency(name: str):
    """
    Decorator to automatically track function execution time.
    
    Usage:
        @track_latency("llm_call")
        async def call_llm():
            ...
    
    This will:
    - Track how long the function takes
    - Increment {name}_success on success
    - Increment {name}_error on failure
    """
    # Imported locally so this decorator carries its own dependency
    from functools import wraps

    def decorator(func):
        @wraps(func)  # Preserve the wrapped function's name, docstring, and signature
        async def wrapper(*args, **kwargs):
            start = datetime.now(timezone.utc)
            try:
                result = await func(*args, **kwargs)
                metrics.increment(f"{name}_success")
                return result
            except Exception:
                metrics.increment(f"{name}_error")
                raise  # Re-raise the exception
            finally:
                # 'finally' runs whether success or failure
                elapsed = (datetime.now(timezone.utc) - start).total_seconds()
                metrics.record_latency(name, elapsed)
        return wrapper
    return decorator


### config/logging_config.py

In [None]:
# Save as: src/caspar/config/logging_config.py

"""
Production logging configuration.

Outputs JSON logs that can be shipped to any log aggregator.
"""

import structlog
import logging
import sys


def configure_production_logging():
    """
    Configure logging for production.
    
    Call this once at application startup.
    """
    
    # Configure structlog to output JSON
    structlog.configure(
        processors=[
            # Include context variables (set elsewhere in code)
            structlog.contextvars.merge_contextvars,
            
            # Add log level (info, warning, error, etc.)
            structlog.processors.add_log_level,
            
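            # Render the call stack when a log call passes stack_info=True
            structlog.processors.StackInfoRenderer(),
            
            # Render exception tracebacks when a log call passes exc_info
            structlog.processors.format_exc_info,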
            
            # Add ISO-format timestamp
            structlog.processors.TimeStamper(fmt="iso"),
            
            # Output as JSON (the key part!)
            structlog.processors.JSONRenderer(),
        ],
        
        # Only log INFO and above (not DEBUG)
        wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
        
        # Use dict for context
        context_class=dict,
        
        # Output to stdout (Docker captures this)
        logger_factory=structlog.PrintLoggerFactory(),
        
        # Cache logger for performance
        cache_logger_on_first_use=True,
    )
    
    # Also configure Python's standard logging library
    # (some libraries use this instead of structlog)
    logging.basicConfig(
        format="%(message)s",
        stream=sys.stdout,
        level=logging.INFO,
    )


def configure_development_logging():
    """
    Configure logging for development.
    
    Uses human-readable output instead of JSON.
    """
    structlog.configure(
        processors=[
            structlog.contextvars.merge_contextvars,
            structlog.processors.add_log_level,
            structlog.processors.StackInfoRenderer(),
            structlog.processors.TimeStamper(fmt="iso"),
            # Human-readable output with colors
            structlog.dev.ConsoleRenderer(colors=True),
        ],
        wrapper_class=structlog.make_filtering_bound_logger(logging.DEBUG),
        context_class=dict,
        logger_factory=structlog.PrintLoggerFactory(),
        cache_logger_on_first_use=True,
    )
    
    logging.basicConfig(
        format="%(message)s",
        stream=sys.stdout,
        level=logging.DEBUG,
    )


---
## Running CASPAR

To run the complete CASPAR system:

### 1. Setup
```bash
cd caspar
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt
pip install -e .
```

### 2. Start PostgreSQL
```bash
docker compose up -d
```

### 3. Configure Environment
```bash
cp .env.example .env
# Edit .env with your OpenAI API key
```

### 4. Build Knowledge Base
```bash
python scripts/build_knowledge_base.py
```

### 5. Run Tests
```bash
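python scripts/run_tests.py              # unit tests only (the default)
python scripts/run_tests.py --suite all  # unit, integration, and evaluation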
```

### 6. Start the API
```bash
uvicorn caspar.api.main:app --reload
```
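
Once the server is running, the endpoints defined in `api/main.py` can be called from any HTTP client. The following is a minimal standard-library sketch, assuming the server listens on uvicorn's default `127.0.0.1:8000`; `build_post`, `send`, and `chat_once` are hypothetical helper names, while the paths and JSON fields come from the request and response models shown earlier.

```python
import json
from urllib import request

BASE_URL = "http://127.0.0.1:8000"  # Assumption: uvicorn's default host and port


def build_post(path: str, payload: dict) -> request.Request:
    """Build a JSON POST request for one of the CASPAR endpoints."""
    return request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def chat_once(customer_id: str, message: str) -> str:
    """Start a conversation, send one message, and return the agent's reply."""
    started = send(build_post("/conversations", {"customer_id": customer_id}))
    conversation_id = started["conversation_id"]
    reply = send(
        build_post(f"/conversations/{conversation_id}/messages", {"message": message})
    )
    return reply["response"]
```

With the API up, `chat_once("CUST-1000", "What is your return policy?")` should return the agent's answer as a string. Nothing is sent until `chat_once` or `send` is called, so the helpers are safe to import.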

### 7. Test Conversation
```bash
python scripts/test_conversation_flow.py
```

---
## Congratulations!

You've completed the Zero to AI Agent book! You now have the skills to build production-ready AI agents using:
- Python fundamentals
- LLM integration with OpenAI
- Agent frameworks (LangChain, LangGraph)
- RAG for knowledge retrieval
- Human-in-the-loop workflows
- Testing and evaluation
- Production deployment

**Next steps:**
- Customize CASPAR for your own use case
- Explore more advanced LangGraph patterns
- Deploy to a cloud platform (Railway, AWS, GCP)
- Build your own agents!