# CrewAI Specialty Level: Custom Tools and Production Patterns

## Building Production-Ready AI Systems

This notebook covers advanced topics for building robust, production-ready CrewAI applications.

### What You'll Learn:
1. Creating custom tools
2. Memory and context management
3. Callbacks and monitoring
4. Error handling patterns
5. Structured outputs with Pydantic
6. Best practices for production

### Prerequisites:
- Completed all previous notebooks
- Understanding of Flows, Crews, Agents, and Tasks

---

## Setup

In [1]:
# Install packages
%pip install crewai crewai-tools python-dotenv --quiet

In [2]:
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# ============================================
# GLOBAL CONFIGURATION
# ============================================
MODEL = "gpt-4o-mini"

# Core imports
from crewai import Agent, Task, Crew, Process, LLM
from crewai.flow.flow import Flow, listen, start, router
from crewai.tools import BaseTool
from pydantic import BaseModel, Field
from typing import Type, Optional, List, Any
import json

# Initialize LLM
llm = LLM(model=f"openai/{MODEL}", temperature=0.7)

print(f"Environment loaded. Using model: {MODEL}")

Environment loaded. Using model: gpt-4o-mini


## Part 1: Creating Custom Tools

Custom tools extend agent capabilities. You can create tools that connect to APIs, databases, or any external service.

In [3]:
# Method 1: Create a tool using BaseTool class

class CalculatorToolInput(BaseModel):
    """Input schema for the calculator tool"""
    operation: str = Field(..., description="The operation: add, subtract, multiply, divide")
    a: float = Field(..., description="First number")
    b: float = Field(..., description="Second number")

class CalculatorTool(BaseTool):
    """A simple calculator tool for agents"""
    name: str = "Calculator"
    description: str = "Performs basic math operations: add, subtract, multiply, divide"
    args_schema: Type[BaseModel] = CalculatorToolInput
    
    def _run(self, operation: str, a: float, b: float) -> str:
        """Execute the calculation"""
        operations = {
            "add": lambda x, y: x + y,
            "subtract": lambda x, y: x - y,
            "multiply": lambda x, y: x * y,
            "divide": lambda x, y: x / y,
        }
        
        if operation not in operations:
            return f"Error: Unknown operation '{operation}'. Use: add, subtract, multiply, divide"
        
        # Return a plain error instead of embedding it in a "a divide b = ..." result string
        if operation == "divide" and b == 0:
            return "Error: Division by zero"
        
        result = operations[operation](a, b)
        return f"{a} {operation} {b} = {result}"

# Create the tool instance
calculator = CalculatorTool()

# Test the tool
print("Testing Calculator Tool:")
print(calculator._run("add", 10, 5))
print(calculator._run("multiply", 7, 8))
print(calculator._run("divide", 100, 4))

Testing Calculator Tool:
10 add 5 = 15
7 multiply 8 = 56
100 divide 4 = 25.0


In [6]:
# Method 2: Create tool using @tool decorator (simpler approach)
from crewai.tools import tool

@tool("Database Query Tool")
def database_query(query: str) -> str:
    """
    Simulates querying a database. In production, connect to actual database.
    
    Args:
        query: SQL-like query string
    
    Returns:
        Query results as JSON string
    """
    # Simulated database
    mock_data = {
        "users": [
            {"id": 1, "name": "Alice", "role": "admin"},
            {"id": 2, "name": "Bob", "role": "user"},
            {"id": 3, "name": "Charlie", "role": "user"}
        ],
        "products": [
            {"id": 101, "name": "Widget", "price": 9.99},
            {"id": 102, "name": "Gadget", "price": 19.99}
        ]
    }
    
    # Simple query parsing
    if "users" in query.lower():
        return json.dumps(mock_data["users"], indent=2)
    elif "products" in query.lower():
        return json.dumps(mock_data["products"], indent=2)
    else:
        return "No results found"

# Test the decorator-based tool
print("Testing Database Query Tool:")
print(database_query("SELECT * FROM users"))

Testing Database Query Tool:
[
  {
    "id": 1,
    "name": "Alice",
    "role": "admin"
  },
  {
    "id": 2,
    "name": "Bob",
    "role": "user"
  },
  {
    "id": 3,
    "name": "Charlie",
    "role": "user"
  }
]
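
The docstring above notes that a production version would connect to a real database. A minimal sketch of what that could look like, assuming SQLite through the standard-library `sqlite3` module. The helpers `make_demo_db` and `query_users_by_role` and the table layout are hypothetical and only mirror the mock data. The query is parameterized, so text supplied by an agent is never spliced into the SQL string.

```python
import json
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database that mirrors the mock 'users' data."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows can be converted to dicts
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, role TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name, role) VALUES (?, ?, ?)",
        [(1, "Alice", "admin"), (2, "Bob", "user"), (3, "Charlie", "user")],
    )
    conn.commit()
    return conn


def query_users_by_role(conn: sqlite3.Connection, role: str) -> str:
    """Return the users with the given role as a JSON string."""
    rows = conn.execute(
        "SELECT id, name, role FROM users WHERE role = ? ORDER BY id",
        (role,),  # bound parameter, not string formatting
    ).fetchall()
    return json.dumps([dict(row) for row in rows], indent=2)


conn = make_demo_db()
print(query_users_by_role(conn, "user"))
```

A function like `query_users_by_role` could be wrapped with `@tool` in the same way as `database_query`, keeping the database connection outside the tool body.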


In [8]:
# Using custom tools with an Agent
data_analyst = Agent(
    role="Data Analyst",
    goal="Analyze data and perform calculations to provide insights",
    backstory="""You are a skilled data analyst who uses tools to query 
    databases and perform calculations. You always verify your results.""",
    llm=llm,
    tools=[calculator, database_query],  # Assign custom tools
    verbose=True
)

# Create a task that uses the tools
analysis_task = Task(
    description="""Analyze the user database:
    1. Query the users table
    2. Calculate the average number of users per role
    3. Provide a summary of findings""",
    expected_output="A detailed analysis with statistics",
    agent=data_analyst
)

# Create and run crew
analysis_crew = Crew(
    agents=[data_analyst],
    tasks=[analysis_task],
    verbose=True
)

print("Crew with custom tools ready!")
# Uncomment to run:
# result = analysis_crew.kickoff()
# print(result)

Crew with custom tools ready!


Tool database_query_tool executed with result: [
  {
    "id": 1,
    "name": "Alice",
    "role": "admin"
  },
  {
    "id": 2,
    "name": "Bob",
    "role": "user"
  },
  {
    "id": 3,
    "name": "Charlie",
    "role": "user"
  }
]...


Tool calculator executed with result: 3 divide 2 = 1.5...


**User Database Analysis**

1. **Query Results:**
   - Total Users: 3
   - Users by Role:
     - Admin: 1
     - User: 2

2. **Average Number of Users per Role:**
   - Total Roles: 2 (Admin, User)
   - Average Users per Role: 1.5

3. **Summary of Findings:**
   - The user database consists of 3 users, categorized into 2 distinct roles.
   - The distribution of users is uneven, with 1 admin and 2 users.
   - On average, there are 1.5 users for each role in the database.
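
The figures in the agent's report can be checked independently. A small sketch, assuming the same three mock user records; the helper name `average_users_per_role` is hypothetical. It counts users per role and divides the total by the number of distinct roles, which is the same 3 / 2 calculation the agent delegated to the calculator tool.

```python
from collections import Counter
from typing import Dict, List

users: List[Dict[str, object]] = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
    {"id": 3, "name": "Charlie", "role": "user"},
]


def average_users_per_role(rows: List[Dict[str, object]]) -> float:
    """Total users divided by the number of distinct roles (0.0 if empty)."""
    role_counts = Counter(row["role"] for row in rows)
    if not role_counts:
        return 0.0
    return sum(role_counts.values()) / len(role_counts)


print(dict(Counter(row["role"] for row in users)))
print(average_users_per_role(users))
```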


## Part 2: Structured Outputs with Pydantic

Use Pydantic models to ensure agents return well-structured, validated outputs.

In [9]:
# Define structured output models
class ArticleSummary(BaseModel):
    """Structured output for article summaries"""
    title: str = Field(..., description="Article title")
    summary: str = Field(..., description="Brief summary (2-3 sentences)")
    key_points: List[str] = Field(..., description="List of key points")
    sentiment: str = Field(..., description="Overall sentiment: positive, negative, neutral")
    word_count: int = Field(..., description="Word count of original content")

class ResearchReport(BaseModel):
    """Structured output for research reports"""
    topic: str = Field(..., description="Research topic")
    findings: List[str] = Field(..., description="Key research findings")
    sources: List[str] = Field(..., description="Sources referenced")
    confidence_score: float = Field(..., ge=0, le=1, description="Confidence in findings (0-1)")
    recommendations: List[str] = Field(..., description="Actionable recommendations")

print("Structured output models defined!")
print(f"ArticleSummary fields: {list(ArticleSummary.model_fields.keys())}")
print(f"ResearchReport fields: {list(ResearchReport.model_fields.keys())}")

Structured output models defined!
ArticleSummary fields: ['title', 'summary', 'key_points', 'sentiment', 'word_count']
ResearchReport fields: ['topic', 'findings', 'sources', 'confidence_score', 'recommendations']
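
The `ge=0, le=1` constraint on `confidence_score` only helps if raw model output is actually validated. A minimal sketch of that check, assuming Pydantic v2 (consistent with the `model_fields` call above). It repeats the `ResearchReport` definition so the snippet runs on its own, and the helper name `parse_report` is hypothetical.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class ResearchReport(BaseModel):
    """Same fields as the model defined above, repeated to keep this runnable."""
    topic: str = Field(..., description="Research topic")
    findings: List[str] = Field(..., description="Key research findings")
    sources: List[str] = Field(..., description="Sources referenced")
    confidence_score: float = Field(..., ge=0, le=1, description="Confidence in findings (0-1)")
    recommendations: List[str] = Field(..., description="Actionable recommendations")


def parse_report(raw_json: str) -> Optional[ResearchReport]:
    """Return a validated report, or None if the JSON is malformed or invalid."""
    try:
        return ResearchReport.model_validate_json(raw_json)
    except ValidationError as exc:
        print(f"Rejected: {exc.error_count()} validation error(s)")
        return None


valid_json = (
    '{"topic": "AI trends", "findings": ["ML accuracy is improving"], '
    '"sources": ["internal notes"], "confidence_score": 0.8, '
    '"recommendations": ["Review data privacy controls"]}'
)
out_of_range_json = valid_json.replace("0.8", "1.7")  # violates le=1

print(parse_report(valid_json))
print(parse_report(out_of_range_json))
```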


In [12]:
# Create task with structured output
summarizer = Agent(
    role="Content Summarizer",
    goal="Create structured summaries of content",
    backstory="Expert at distilling information into structured formats",
    llm=llm,
    verbose=True
)

# Task with Pydantic output type
summary_task = Task(
    description="""Summarize the following article about AI trends:
    
    "Artificial Intelligence is transforming industries worldwide. From healthcare 
    to finance, AI applications are improving efficiency and enabling new capabilities.
    Machine learning models are becoming more accurate, while natural language processing
    is enabling better human-computer interaction. However, challenges remain in areas
    like data privacy, bias, and regulatory compliance."
    
    Provide a structured summary with all required fields.""",
    expected_output="A complete ArticleSummary with all fields populated",
    agent=summarizer,
    output_pydantic=ArticleSummary  # Enforce structured output
)

# Create crew
summary_crew = Crew(
    agents=[summarizer],
    tasks=[summary_task],
    verbose=True
)

print("Crew with structured output ready!")
# Uncomment to run:
# result = summary_crew.kickoff()
# print(f"Type: {type(result)}")
# print(f"Title: {result.pydantic.title}")
# print(f"Key Points: {result.pydantic.key_points}")

Crew with structured output ready!


Type: <class 'crewai.crews.crew_output.CrewOutput'>
Title: AI Trends
Key Points: ['AI is transforming industries such as healthcare and finance.', 'AI applications are improving efficiency and enabling new capabilities.', 'Machine learning models are becoming more accurate.', 'Natural language processing is enhancing human-computer interaction.', 'Challenges include data privacy, bias, and regulatory compliance.']


## Part 3: Memory and Context

CrewAI supports different memory types to help agents retain and recall information.

In [15]:
# Memory types in CrewAI:
# 1. Short-term Memory: Conversation context within a session
# 2. Long-term Memory: Persists across sessions
# 3. Entity Memory: Tracks entities mentioned in conversations

# Create an agent with memory enabled
memory_agent = Agent(
    role="Customer Support Agent",
    goal="Provide helpful customer support while remembering context",
    backstory="""You are a friendly support agent who remembers customer 
    preferences and past interactions to provide personalized service.""",
    llm=llm,
    memory=True,  # Enable memory for this agent
    verbose=True
)

# Create a crew with memory configuration
memory_crew = Crew(
    agents=[memory_agent],
    tasks=[],  # Will add tasks dynamically
    memory=True,  # Enable crew-level memory
    verbose=True,
    # Memory configuration options:
    # embedder={
    #     "provider": "openai",
    #     "config": {"model": "text-embedding-3-small"}
    # }
)

print("Memory-enabled agent and crew created!")
print(f"Crew memory enabled: {memory_crew.memory}")

Memory-enabled agent and crew created!
Crew memory enabled: True


## Part 4: Callbacks and Monitoring

Callbacks allow you to monitor and react to events during crew execution.

In [16]:
# Define callback functions
def task_callback(output):
    """Called when a task completes"""
    print(f"\n[CALLBACK] Task completed!")
    print(f"  Output length: {len(str(output))} characters")
    # In production: log to monitoring system, update database, etc.

def step_callback(step_output):
    """Called after each agent step"""
    print(f"\n[STEP] Agent step completed")
    # In production: track progress, measure latency, etc.

# Create agent with callbacks
monitored_agent = Agent(
    role="Monitored Agent",
    goal="Complete tasks while being monitored",
    backstory="An agent that demonstrates callback functionality",
    llm=llm,
    step_callback=step_callback,  # Called after each step
    verbose=True
)

# Create task with callback
monitored_task = Task(
    description="Write a short paragraph about AI monitoring",
    expected_output="A concise paragraph",
    agent=monitored_agent,
    callback=task_callback  # Called when task completes
)

print("Monitored agent and task created!")
print("Callbacks will trigger during execution.")

Monitored agent and task created!
Callbacks will trigger during execution.
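
The comments in `step_callback` mention measuring latency in production. One way to sketch that, assuming a callback is simply any callable that accepts one argument: a small hypothetical `StepTimer` class that records the time elapsed between consecutive steps. The injectable `clock` parameter is an illustration aid that makes the timer easy to test.

```python
import time
from typing import Any, Callable, List


class StepTimer:
    """Callable that records the time elapsed between consecutive steps."""

    def __init__(self, clock: Callable[[], float] = time.perf_counter):
        self._clock = clock
        self._last = clock()  # timing starts when the timer is created
        self.durations: List[float] = []

    def __call__(self, step_output: Any) -> None:
        now = self._clock()
        self.durations.append(now - self._last)
        self._last = now

    @property
    def total(self) -> float:
        """Total time covered by all recorded steps."""
        return sum(self.durations)


timer = StepTimer()
for step in ("plan", "act"):
    time.sleep(0.01)  # stand-in for real agent work
    timer(step)
print(f"{len(timer.durations)} steps, {timer.total:.3f}s total")
```

An instance could be passed wherever a plain function is accepted, for example `step_callback=StepTimer()`, and inspected after the run.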


## Part 5: Error Handling Patterns

Robust error handling is essential for production systems.

In [17]:
# Error handling wrapper for crew execution
import traceback
from datetime import datetime

class CrewExecutionResult(BaseModel):
    """Standardized result format for crew execution"""
    success: bool
    result: Optional[str] = None
    error: Optional[str] = None
    execution_time: float = 0.0
    timestamp: str = ""

def execute_crew_safely(crew: Crew, inputs: Optional[dict] = None) -> CrewExecutionResult:
    """Execute a crew with comprehensive error handling"""
    start_time = datetime.now()
    
    try:
        # Execute the crew
        if inputs:
            result = crew.kickoff(inputs=inputs)
        else:
            result = crew.kickoff()
        
        execution_time = (datetime.now() - start_time).total_seconds()
        
        return CrewExecutionResult(
            success=True,
            result=str(result),
            execution_time=execution_time,
            timestamp=datetime.now().isoformat()
        )
        
    except Exception as e:
        execution_time = (datetime.now() - start_time).total_seconds()
        
        return CrewExecutionResult(
            success=False,
            error=f"{type(e).__name__}: {str(e)}",
            execution_time=execution_time,
            timestamp=datetime.now().isoformat()
        )

# Example usage
print("Error handling wrapper defined!")
print("Usage: result = execute_crew_safely(my_crew, inputs={'key': 'value'})")

Error handling wrapper defined!
Usage: result = execute_crew_safely(my_crew, inputs={'key': 'value'})
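
The wrapper above keeps only the exception type and message. When debugging production failures, the full traceback is often worth keeping as well. A standalone sketch of the same pattern, using a hypothetical `run_safely` helper that accepts any zero-argument callable, so it can be tried without a crew or API key. It uses `time.perf_counter()` for timing because that clock is monotonic.

```python
import time
import traceback
from typing import Any, Callable, Dict


def run_safely(fn: Callable[[], Any]) -> Dict[str, Any]:
    """Run fn and return a result dict instead of raising."""
    start = time.perf_counter()
    try:
        value = fn()
        return {
            "success": True,
            "result": str(value),
            "error": None,
            "traceback": None,
            "execution_time": time.perf_counter() - start,
        }
    except Exception as exc:
        return {
            "success": False,
            "result": None,
            "error": f"{type(exc).__name__}: {exc}",
            "traceback": traceback.format_exc(),  # full stack for logs
            "execution_time": time.perf_counter() - start,
        }


ok = run_safely(lambda: 6 * 7)
failed = run_safely(lambda: 1 / 0)
print(ok["success"], ok["result"])
print(failed["success"], failed["error"])
```

With a crew, the callable would be something like `lambda: crew.kickoff(inputs=inputs)`.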


## Part 6: Production Best Practices

Key patterns for building production-ready CrewAI applications.

In [18]:
# Production Pattern 1: Configuration Management

class AgentConfig(BaseModel):
    """Centralized agent configuration"""
    role: str
    goal: str
    backstory: str
    temperature: float = 0.7
    max_iterations: int = 25
    verbose: bool = False

class CrewConfig(BaseModel):
    """Centralized crew configuration"""
    name: str
    description: str
    agents: List[AgentConfig]
    process: str = "sequential"
    memory: bool = False
    verbose: bool = False

# Factory: build an Agent from a validated AgentConfig.
# Note: config.temperature is not applied here; sampling settings belong to the
# LLM instance passed in, so set them when constructing that LLM.
def create_agent_from_config(config: AgentConfig, llm: LLM) -> Agent:
    """Factory function to create agents from configuration"""
    return Agent(
        role=config.role,
        goal=config.goal,
        backstory=config.backstory,
        llm=llm,
        max_iter=config.max_iterations,
        verbose=config.verbose
    )

print("Configuration management pattern defined!")

Configuration management pattern defined!
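
The configuration models above still need values from somewhere. A minimal sketch of reading the tunable fields from environment variables, assuming hypothetical variable names (`AGENT_TEMPERATURE`, `AGENT_MAX_ITERATIONS`, `AGENT_VERBOSE`) and the same defaults as `AgentConfig`. The function accepts any mapping, so it can be exercised without touching the real environment.

```python
import os
from typing import Any, Dict, Mapping, Optional

TRUTHY = {"1", "true", "yes", "on"}


def load_agent_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Read agent settings from env (defaults to os.environ), with fallbacks."""
    source = os.environ if env is None else env
    return {
        "temperature": float(source.get("AGENT_TEMPERATURE", "0.7")),
        "max_iterations": int(source.get("AGENT_MAX_ITERATIONS", "25")),
        "verbose": source.get("AGENT_VERBOSE", "false").strip().lower() in TRUTHY,
    }


# Explicit mapping used here so the demo does not depend on the real environment
print(load_agent_settings({"AGENT_MAX_ITERATIONS": "10", "AGENT_VERBOSE": "true"}))
```

The resulting dict could be unpacked into the config model, for example `AgentConfig(role=..., goal=..., backstory=..., **settings)`, so that Pydantic validates the final values.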


In [19]:
# Production Pattern 2: Logging and Observability

import logging

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger('crewai_app')

class ObservableCrew:
    """Wrapper that adds observability to Crew execution"""
    
    def __init__(self, crew: Crew, name: str):
        self.crew = crew
        self.name = name
        self.execution_count = 0
        self.total_execution_time = 0.0
    
    def kickoff(self, inputs: Optional[dict] = None):
        """Execute crew with logging and metrics"""
        self.execution_count += 1
        logger.info(f"Starting crew '{self.name}' - Execution #{self.execution_count}")
        
        start_time = datetime.now()
        
        try:
            if inputs:
                result = self.crew.kickoff(inputs=inputs)
            else:
                result = self.crew.kickoff()
            
            execution_time = (datetime.now() - start_time).total_seconds()
            self.total_execution_time += execution_time
            
            logger.info(f"Crew '{self.name}' completed in {execution_time:.2f}s")
            return result
            
        except Exception as e:
            logger.error(f"Crew '{self.name}' failed: {str(e)}")
            raise
    
    def get_metrics(self) -> dict:
        """Return execution metrics"""
        return {
            "name": self.name,
            "execution_count": self.execution_count,
            "total_execution_time": self.total_execution_time,
            "average_execution_time": self.total_execution_time / max(1, self.execution_count)
        }

print("Observable crew wrapper defined!")

Observable crew wrapper defined!


In [None]:
# Production Pattern 3: Rate Limiting and Retry Logic

import time
from functools import wraps

def retry_with_backoff(max_retries: int = 3, base_delay: float = 1.0):
    """Decorator for retry logic with exponential backoff"""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    if attempt == max_retries - 1:
                        raise
                    delay = base_delay * (2 ** attempt)
                    logger.warning(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay}s...")
                    time.sleep(delay)
        return wrapper
    return decorator

# Example usage
@retry_with_backoff(max_retries=3, base_delay=1.0)
def execute_with_retry(crew: Crew, inputs: Optional[dict] = None):
    """Execute crew with automatic retry on failure"""
    if inputs:
        return crew.kickoff(inputs=inputs)
    return crew.kickoff()

print("Retry logic pattern defined!")
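
The cell above covers retries; the rate-limiting half of the pattern can be sketched separately. This is one simple approach, not a CrewAI feature: a hypothetical `MinIntervalLimiter` that enforces a minimum gap between calls. The `clock` and `sleep` parameters are injectable only to make the behavior easy to test.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Block so that consecutive calls are at least min_interval seconds apart."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ):
        if min_interval < 0:
            raise ValueError("min_interval must be non-negative")
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None  # None until the first call

    def wait(self) -> float:
        """Sleep if needed and return the number of seconds waited."""
        now = self._clock()
        delay = 0.0 if self._next_allowed is None else max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # The call proceeds at now + delay; the next one may start min_interval later
        self._next_allowed = now + delay + self.min_interval
        return delay


limiter = MinIntervalLimiter(min_interval=0.05)
for i in range(3):
    waited = limiter.wait()
    print(f"call {i}: waited {waited:.3f}s")
```

Calling `limiter.wait()` immediately before `crew.kickoff()` would space out executions; a token bucket would be a better fit when short bursts are acceptable.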

## Practice Exercise: Build a Production-Ready System

Combine all the patterns learned to create a robust AI application.

In [None]:
# EXERCISE: Production-Ready Document Processor
# Build a system that:
# 1. Uses custom tools to read/process documents
# 2. Returns structured output (Pydantic)
# 3. Includes error handling and logging
# 4. Uses a Flow to orchestrate the workflow

class DocumentProcessingState(BaseModel):
    document_path: str = ""
    content: str = ""
    analysis: Optional[ArticleSummary] = None
    error: Optional[str] = None

class DocumentProcessorFlow(Flow[DocumentProcessingState]):
    """Production-ready document processing flow"""
    
    @start()
    def load_document(self):
        """Load and validate the document"""
        # TODO: Implement document loading with error handling
        pass
    
    @listen(load_document)
    def analyze_document(self, content):
        """Analyze document using a Crew"""
        # TODO: Create a crew with structured output
        pass
    
    @listen(analyze_document)
    def save_results(self, analysis):
        """Save analysis results"""
        # TODO: Implement result persistence
        pass

# Implementation hints:
# - Use try/except blocks in each step
# - Log important events
# - Validate inputs using Pydantic
# - Handle edge cases (empty document, invalid format, etc.)

print("Exercise: Build your production-ready document processor!")
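
As a starting point for the `load_document` step, the edge cases listed in the hints can be handled in a plain helper before any Flow logic is written. A sketch under stated assumptions: the helper name `load_text_document` and the allowed suffixes are hypothetical, and only UTF-8 text files are considered.

```python
import tempfile
from pathlib import Path

ALLOWED_SUFFIXES = {".txt", ".md"}  # assumed formats for this exercise


def load_text_document(path: str) -> str:
    """Return the stripped text of a document, raising on the common edge cases."""
    doc = Path(path)
    if doc.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported format: '{doc.suffix}'")
    if not doc.is_file():
        raise FileNotFoundError(f"Document not found: {path}")
    content = doc.read_text(encoding="utf-8").strip()
    if not content:
        raise ValueError(f"Document is empty: {path}")
    return content


# Demo with a temporary file so the snippet runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("AI is transforming industries worldwide.\n", encoding="utf-8")
    print(load_text_document(str(sample)))
```

Inside the Flow, the step could catch these exceptions and store the message in `self.state.error` rather than letting the flow crash.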

## Summary and Best Practices Checklist

Congratulations! You have completed the CrewAI learning path.

### Key Takeaways:

**Custom Tools:**
- Use BaseTool class for complex tools with validation
- Use @tool decorator for simple tools
- Always write clear names and descriptions; the agent relies on them to decide when and how to call a tool

**Structured Outputs:**
- Define Pydantic models for predictable outputs
- Use Field descriptions to guide the AI
- Validate outputs before using them

**Memory:**
- Enable memory for context-aware agents
- Configure appropriate memory providers
- Consider privacy implications

**Production Patterns:**
- Use configuration management for flexibility
- Implement comprehensive logging
- Add retry logic for resilience
- Monitor execution metrics

### Production Checklist:
- [ ] Environment variables for all secrets
- [ ] Structured logging configured
- [ ] Error handling in all flows
- [ ] Rate limiting implemented
- [ ] Retry logic for API calls
- [ ] Input validation with Pydantic
- [ ] Output validation with Pydantic
- [ ] Monitoring and alerting set up
- [ ] Documentation complete

---

**Resources:**
- [CrewAI Documentation](https://docs.crewai.com)
- [CrewAI GitHub](https://github.com/crewAIInc/crewAI)
- [CrewAI Community](https://community.crewai.com)