# 📓 The GenAI Revolution Cookbook

**Title:** How to Build Multi-Agent AI Systems with CrewAI and YAML

**Description:** Build production-ready multi-agent AI systems with CrewAI using reusable YAML-first patterns, explicit tools and tasks, guardrails, and interactive training loops.

---

*This Jupyter notebook contains executable code examples. Run the cells below to try out the code yourself!*



## Introduction

This guide shows you how to build a multi-agent customer feedback analysis system using CrewAI. You'll create three specialized agents (one for sentiment analysis, one for summarization, and one for visualization) that work together in a sequential workflow to process customer feedback, extract insights, and generate a final report with charts.

You'll learn how to define agents and tasks in YAML, integrate the CSVSearchTool for grounded analysis, validate structured outputs, and execute generated plotting code. All code is Colab-ready and runnable end-to-end with minimal setup.

By the end, you'll have a working multi-agent system that produces reproducible, structured insights from raw feedback data.

---

## Setup and Installation

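Run the following cell to install CrewAI and its dependencies. The CrewAI packages are pinned for reproducibility; pandas, matplotlib, and PyYAML are left unpinned, so pin them as well if you need a fully reproducible environment.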

In [None]:
!pip install -q crewai==0.28.0 crewai-tools==0.1.6 pandas matplotlib pyyaml

Set your OpenAI API key. In Colab, use the following cell to securely input your key:

In [None]:
import os
from getpass import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")

print("API key set successfully.")

---

## Why Use CrewAI for This Problem

CrewAI provides structured agent and task definitions, built-in coordination patterns, and YAML-first configuration. Compared to [LangGraph](/article/how-to-build-a-stateful-ai-agent-with-langgraph-step-by-step-5) (which requires manual node orchestration) and single-agent ReAct patterns, CrewAI minimizes boilerplate and clarifies agent responsibilities. YAML configs enable version control and code-reviewable changes, while the sequential process simplifies testing and auditability.

For this tutorial, CrewAI's CSV tool integration and memory features allow agents to reference concrete feedback rows and share context across tasks, which helps reduce hallucinations and keeps the outputs tied to the source data.

---

## Core Concepts

### Agents

Agents encapsulate role, goal, and backstory. You define these in YAML and instantiate them in Python. Agents can be assigned tools (e.g., CSVSearchTool) and memory to make capabilities explicit and auditable.

### Tasks

Tasks describe what an agent should do and what output format is expected. CrewAI tasks include a description, expected output schema, and an assigned agent. Tasks execute sequentially by default, passing context forward.

### Tools

Tools extend agent capabilities. The CSVSearchTool allows agents to query CSV files directly, grounding analysis in real data and reducing hallucinations.

### Crew and Process

A Crew assembles agents and tasks into a workflow. The Process.sequential mode executes tasks in order, ensuring each task can reference prior outputs.
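
To make the sequential hand-off concrete, here is a minimal plain-Python sketch of the idea. It does not use CrewAI and is not a description of how the library is implemented; `run_sequential` and the three step functions are hypothetical names chosen only for illustration.

```python
from typing import Callable, List

# A step receives the outputs of all earlier steps and returns its own output.
Step = Callable[[List[str]], str]


def run_sequential(steps: List[Step]) -> List[str]:
    """Run steps in order, giving each one the outputs produced so far."""
    outputs: List[str] = []
    for step in steps:
        # Pass a copy so a step cannot mutate the shared history.
        outputs.append(step(list(outputs)))
    return outputs


# Hypothetical stand-ins for the stages of this tutorial.
def analyze(prior: List[str]) -> str:
    return "sentiment analysis output"


def summarize(prior: List[str]) -> str:
    # Uses the most recent prior output as its context.
    return f"summary of [{prior[-1]}]"


def report(prior: List[str]) -> str:
    return f"report built from {len(prior)} prior outputs"


outputs = run_sequential([analyze, summarize, report])
print(outputs[-1])  # → report built from 2 prior outputs
```

The point of the sketch is the ordering guarantee: a later step can only rely on outputs that already exist, which is why task order matters when you assemble the crew.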

---

## Using CrewAI in Practice

### Step 1: Create Sample Data

Generate a sample customer feedback CSV for testing. This cell creates the file if it doesn't exist.

In [None]:
import pandas as pd
from pathlib import Path

Path("figures").mkdir(exist_ok=True)
csv_path = Path("customer_feedback.csv")

if not csv_path.exists():
    data = [
        {"customer_id": 1, "feedback_text": "Loved the new UI, checkout was smooth!", "rating": 5, "date": "2024-09-01"},
        {"customer_id": 2, "feedback_text": "Support was slow. Still unresolved ticket.", "rating": 2, "date": "2024-09-03"},
        {"customer_id": 3, "feedback_text": "Great pricing but the app crashes sometimes.", "rating": 3, "date": "2024-09-05"},
        {"customer_id": 4, "feedback_text": "The UX is confusing on mobile.", "rating": 2, "date": "2024-09-08"},
        {"customer_id": 5, "feedback_text": "Fast delivery and friendly service!", "rating": 5, "date": "2024-09-12"},
    ]
    pd.DataFrame(data).to_csv(csv_path, index=False)
    print("Sample customer_feedback.csv created.")
else:
    print("customer_feedback.csv already exists.")

### Step 2: Define Agents in YAML

Create an `agents.yaml` file to define the three agents. This cell writes the file to disk.

In [None]:
agents_yaml_content = """feedback_analysis_agent:
  role: "Customer Sentiment Analyst"
  goal: "Evaluate sentiment and themes from customer feedback accurately and reproducibly."
  backstory: >
    You analyze feedback at scale, balancing qualitative nuance with structured outputs.
    You use CSV search to reference concrete feedback rows. You avoid hallucinations.

summary_report_agent:
  role: "Insights Summarization Specialist"
  goal: "Produce an executive-ready summary table and concise narrative from analyzed feedback."
  backstory: >
    You write clearly, focus on business-relevant insights, and adhere to requested output formats.

visualization_agent:
  role: "Visualization Specialist"
  goal: "Generate executable Python code for sentiment distribution and trend charts."
  backstory: >
    You produce matplotlib code that reads customer_feedback.csv and saves figures to ./figures/.
"""

Path("agents.yaml").write_text(agents_yaml_content, encoding="utf-8")
print("agents.yaml created.")

### Step 3: Define Tasks in YAML

Create a `tasks.yaml` file to define the four tasks. Each task specifies a description, expected output format, and will be assigned to an agent in code.

In [None]:
tasks_yaml_content = """sentiment_evaluation:
  description: >
    Read customer_feedback.csv. For each feedback row, infer sentiment ("positive", "neutral", or "negative"),
    and extract up to 3 themes (e.g., "pricing", "support", "ux"). Summarize counts per sentiment and
    top themes overall.
  expected_output: >
    JSON with keys:
    - "row_sentiments": list of {customer_id, date, sentiment, themes: [str]}
    - "sentiment_counts": {positive: int, neutral: int, negative: int}
    - "top_themes": list of {theme: str, count: int}

summary_table_creation:
  description: >
    Using sentiment evaluation JSON, create a concise summary table highlighting overall sentiment distribution,
    top 5 themes with counts, and 3 bullet insights for executives.
  expected_output: >
    Markdown containing:
    - "Sentiment Distribution" table with columns [sentiment, count]
    - "Top Themes" table with columns [theme, count]
    - "Key Insights" as 3 bullet points

chart_visualization:
  description: >
    Output executable Python/matplotlib code that reads customer_feedback.csv and saves
    figures to ./figures/sentiment_dist.png and ./figures/sentiment_trend.png.
  expected_output: >
    Python code block that generates and saves the two charts.

final_report_assembly:
  description: >
    Assemble the final report including: a brief narrative (<=150 words), the summary table markdown,
    and references to the generated figures. Ensure the report is clean and ready to share.
  expected_output: >
    Markdown with sections: "Overview", "Summary", "Figures", "Appendix: Method".
"""

Path("tasks.yaml").write_text(tasks_yaml_content, encoding="utf-8")
print("tasks.yaml created.")
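
The `expected_output` of `sentiment_evaluation` implies some internal consistency: the per-sentiment counts should match a recount of the rows, and each row should carry an allowed label and at most three themes. The following is a rough sketch of such a check. `check_sentiment_payload` is a hypothetical helper, and the `example` payload is hand-written for illustration, not real agent output.

```python
from collections import Counter

ALLOWED_SENTIMENTS = ("positive", "neutral", "negative")


def check_sentiment_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means no inconsistency was found."""
    problems = []
    rows = payload.get("row_sentiments", [])
    counts = payload.get("sentiment_counts", {})

    # Each row needs an allowed label and no more than three themes.
    for row in rows:
        cid = row.get("customer_id")
        if row.get("sentiment") not in ALLOWED_SENTIMENTS:
            problems.append(f"customer {cid}: unknown sentiment {row.get('sentiment')!r}")
        if len(row.get("themes", [])) > 3:
            problems.append(f"customer {cid}: more than 3 themes")

    # The reported counts must agree with a recount of the rows.
    recount = Counter(row.get("sentiment") for row in rows)
    for label in ALLOWED_SENTIMENTS:
        if counts.get(label, 0) != recount.get(label, 0):
            problems.append(
                f"{label}: reported {counts.get(label, 0)}, recounted {recount.get(label, 0)}"
            )
    return problems


# Hand-written illustrative payload in the shape described by expected_output.
example = {
    "row_sentiments": [
        {"customer_id": 1, "date": "2024-09-01", "sentiment": "positive", "themes": ["ux"]},
        {"customer_id": 2, "date": "2024-09-03", "sentiment": "negative", "themes": ["support"]},
    ],
    "sentiment_counts": {"positive": 1, "neutral": 0, "negative": 1},
    "top_themes": [{"theme": "ux", "count": 1}, {"theme": "support", "count": 1}],
}
print(check_sentiment_payload(example))  # → []
```

A check like this could run on the parsed JSON after the crew finishes, before the numbers are trusted in a report.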

### Step 4: Load YAML Configurations

Load the YAML files into Python dictionaries using this helper function.

In [None]:
import yaml

def load_yaml(path: str) -> dict:
    with open(path, "r", encoding="utf-8") as f:
        return yaml.safe_load(f)

agents_cfg = load_yaml("agents.yaml")
tasks_cfg = load_yaml("tasks.yaml")
print("YAML configurations loaded.")
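
Because the later cells index into these dictionaries directly, a missing or empty key would only surface as a `KeyError` or an empty prompt at agent creation time. A small check right after loading can catch that earlier. The sketch below works on plain dictionaries; `find_missing_keys` and the `demo_agents` data are hypothetical and exist only to show the idea.

```python
REQUIRED_AGENT_KEYS = ("role", "goal", "backstory")
REQUIRED_TASK_KEYS = ("description", "expected_output")


def find_missing_keys(config: dict, required: tuple) -> dict:
    """Map each entry name to the required keys it lacks or leaves empty."""
    missing = {}
    for name, fields in config.items():
        fields = fields or {}  # an entry with no body loads from YAML as None
        absent = [key for key in required if not str(fields.get(key) or "").strip()]
        if absent:
            missing[name] = absent
    return missing


# Hypothetical stand-in for a loaded agents.yaml with one incomplete entry.
demo_agents = {
    "feedback_analysis_agent": {
        "role": "Customer Sentiment Analyst",
        "goal": "Evaluate sentiment.",
        "backstory": "You analyze feedback.",
    },
    "visualization_agent": {"role": "Visualization Specialist", "goal": ""},
}
print(find_missing_keys(demo_agents, REQUIRED_AGENT_KEYS))
# → {'visualization_agent': ['goal', 'backstory']}
```

In the notebook you would call it as `find_missing_keys(agents_cfg, REQUIRED_AGENT_KEYS)` and `find_missing_keys(tasks_cfg, REQUIRED_TASK_KEYS)` and stop if either result is non-empty.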

### Step 5: Instantiate the CSVSearchTool

Create a CSVSearchTool instance for the feedback CSV. This tool will be assigned to the feedback analysis agent.

In [None]:
from crewai_tools import CSVSearchTool

csv_tool = CSVSearchTool(csv="customer_feedback.csv")
print("CSVSearchTool instantiated.")

### Step 6: Create Agent Instances

Instantiate the three agents from the YAML configuration. The feedback analysis agent receives the CSV tool and memory. The summary agent also has memory to reference prior outputs. The visualization agent does not need memory.

In [None]:
from crewai import Agent

feedback_analysis_agent = Agent(
    role=agents_cfg["feedback_analysis_agent"]["role"],
    goal=agents_cfg["feedback_analysis_agent"]["goal"],
    backstory=agents_cfg["feedback_analysis_agent"]["backstory"],
    tools=[csv_tool],
    memory=True
)

summary_report_agent = Agent(
    role=agents_cfg["summary_report_agent"]["role"],
    goal=agents_cfg["summary_report_agent"]["goal"],
    backstory=agents_cfg["summary_report_agent"]["backstory"],
    memory=True
)

visualization_agent = Agent(
    role=agents_cfg["visualization_agent"]["role"],
    goal=agents_cfg["visualization_agent"]["goal"],
    backstory=agents_cfg["visualization_agent"]["backstory"],
    memory=False
)

print("Agents created.")

### Step 7: Create Task Instances

Instantiate the four tasks from the YAML configuration and assign each to its corresponding agent.

In [None]:
from crewai import Task

sentiment_evaluation = Task(
    description=tasks_cfg["sentiment_evaluation"]["description"],
    expected_output=tasks_cfg["sentiment_evaluation"]["expected_output"],
    agent=feedback_analysis_agent
)

summary_table_creation = Task(
    description=tasks_cfg["summary_table_creation"]["description"],
    expected_output=tasks_cfg["summary_table_creation"]["expected_output"],
    agent=summary_report_agent
)

chart_visualization = Task(
    description=tasks_cfg["chart_visualization"]["description"],
    expected_output=tasks_cfg["chart_visualization"]["expected_output"],
    agent=visualization_agent
)

final_report_assembly = Task(
    description=tasks_cfg["final_report_assembly"]["description"],
    expected_output=tasks_cfg["final_report_assembly"]["expected_output"],
    agent=summary_report_agent
)

print("Tasks created.")

### Step 8: Assemble the Crew

Create a Crew with the agents and tasks. Use Process.sequential to execute tasks in order.

In [None]:
from crewai import Crew, Process

crew = Crew(
    agents=[feedback_analysis_agent, summary_report_agent, visualization_agent],
    tasks=[sentiment_evaluation, summary_table_creation, chart_visualization, final_report_assembly],
    process=Process.sequential
)

print("Crew assembled.")

### Step 9: Run the Workflow

Execute the crew workflow and capture the final report. Measure execution time for performance tracking.

In [None]:
import time

start = time.perf_counter()
result = crew.kickoff()
elapsed = time.perf_counter() - start

print("\n=== Final Report ===\n")
print(result)
print(f"\nElapsed: {elapsed:.2f}s")

---

## Run and Evaluate

### Validate Structured JSON Output

Check that the sentiment evaluation task produced valid JSON with the expected schema.

In [None]:
import json

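def validate_sentiment_json(output: str) -> bool:
    """Return True if output parses as a JSON object with the expected top-level keys."""
    required_keys = {"row_sentiments", "sentiment_counts", "top_themes"}
    try:
        analysis_json = json.loads(output)
    except (json.JSONDecodeError, TypeError) as e:
        print(f"Analysis output is not valid JSON: {e}")
        return False
    # json.loads can also return a list or scalar, which has no keys to check.
    if not isinstance(analysis_json, dict):
        print("Sentiment analysis output is not a JSON object.")
        return False
    missing = required_keys - analysis_json.keys()
    if missing:
        print(f"Missing required keys in sentiment analysis output: {sorted(missing)}")
        return False
    print("Sentiment JSON is valid.")
    return True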

# Assuming sentiment_evaluation.output contains the JSON string
# validate_sentiment_json(sentiment_evaluation.output)
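
Language models can wrap a JSON answer in a Markdown code fence, which makes `json.loads` fail on otherwise valid output. The sketch below strips such a fence before parsing. `extract_json_text` and `parse_llm_json` are hypothetical helpers, not part of CrewAI, and they take a plain string, so pass in whatever raw text your CrewAI version exposes for the task output.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence marker, built here to avoid writing it literally

# Matches a fenced block (optionally tagged "json") and captures its body.
_FENCE_RE = re.compile(
    FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE
)


def extract_json_text(raw: str) -> str:
    """Return the body of the first fenced block, or the whole string if none is found."""
    match = _FENCE_RE.search(raw)
    return (match.group(1) if match else raw).strip()


def parse_llm_json(raw: str):
    """Parse JSON from text that may or may not be wrapped in a code fence."""
    return json.loads(extract_json_text(raw))


# Illustrative input: a fenced JSON answer surrounded by chatter.
wrapped = (
    "Here is the result:\n"
    + FENCE + "json\n"
    + '{"sentiment_counts": {"positive": 2}}\n'
    + FENCE
)
print(parse_llm_json(wrapped)["sentiment_counts"])  # → {'positive': 2}
```

This handles only the simple case of a single fenced block; text with several blocks or a non-JSON language tag would still need manual inspection.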

### Execute Visualization Code

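If the visualization agent returned executable Python code, run it to generate the charts. This cell executes the code in a fresh namespace so it cannot overwrite notebook variables. Note that `exec` is not a security sandbox: the generated code still runs with full access to the filesystem and network, so review it before executing.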

In [None]:
def run_plotting_snippet(snippet: str):
    namespace = {}
    try:
        exec(snippet, namespace, namespace)
        print("Plotting code executed successfully.")
    except Exception as e:
        print(f"Error executing plotting code: {e}")

# Example usage:
# if "plt." in chart_visualization.output:
#     run_plotting_snippet(chart_visualization.output)
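
Before handing generated code to `exec`, it can help to run a static check first. The sketch below parses the snippet with the standard `ast` module, reports syntax errors, and flags imports outside a small allowlist. `check_snippet` and `ALLOWED_MODULES` are hypothetical names, and the allowlist is an assumption about what plotting code for this tutorial should need. This is a guardrail against obvious surprises, not a sandbox: it does not catch `__import__`, `open`, or other built-in calls.

```python
import ast

# Assumed allowlist for plotting code in this tutorial; adjust to your needs.
ALLOWED_MODULES = {"pandas", "matplotlib", "numpy", "pathlib"}


def check_snippet(snippet: str) -> list:
    """Return reasons not to execute the snippet; an empty list means none were found."""
    try:
        tree = ast.parse(snippet)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # relative imports have no module name
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level not in ALLOWED_MODULES:
                problems.append(f"import of {top_level!r} is not on the allowlist")
    return problems


good = "import pandas as pd\nimport matplotlib.pyplot as plt\n"
bad = "import subprocess\nsubprocess.run(['ls'])\n"
print(check_snippet(good))  # → []
print(check_snippet(bad))   # → ["import of 'subprocess' is not on the allowlist"]
```

One way to use it is to call `run_plotting_snippet` only when `check_snippet` returns an empty list, and to print the problems otherwise.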

### Verify Generated Figures

Check that the figures were saved to disk.

In [None]:
import os

expected_figures = ["figures/sentiment_dist.png", "figures/sentiment_trend.png"]
for fig in expected_figures:
    if os.path.exists(fig):
        print(f"{fig} exists.")
    else:
        print(f"{fig} not found.")

---

## Conclusion

You've built a multi-agent customer feedback analysis system using CrewAI. You defined agents and tasks in YAML, integrated the CSVSearchTool for grounded analysis, validated structured outputs, and executed generated plotting code. The sequential process ensured each task could reference prior outputs, and the YAML-first approach made the system version-controllable and auditable.

Next steps: add a QA gate to validate outputs before final assembly, or parallelize independent tasks like summary and visualization to reduce execution time. For deeper dives into structured data extraction pipelines, see our [structured data extraction guide](/article/structured-data-extraction-with-llms-how-to-build-a-pipeline-3). If you're curious about building agents from scratch using the ReAct pattern, check out our [GPT-4 ReAct agent tutorial](/article/how-to-build-an-llm-agent-from-scratch-with-gpt-4-react-2).