# Lesson 9: Research and Outline Agents

In this lesson, we'll build the **first two agents** of our SEO pipeline:

1. **Research Agent** — Searches the web, analyzes keywords and competitor content
2. **Outline Agent** — Creates a structured outline from the research results

These are the real agents from our product. The flow:

```
Topic --> [Research Agent] --> Research notes --> [Outline Agent] --> ContentOutline (JSON)
```

The Research Agent uses **DuckDuckGoTools** for web search. The Outline Agent uses **output_schema** to return structured JSON instead of free-form text.

> **Note**: Claude (Anthropic) supports using `tools` and `output_schema` together. This is why we chose Claude Sonnet for these agents.

## What's Different in Module 3

In Module 2, you built simple agents with basic schemas (3 fields) and short chains (2 agents).

Now we're building the **real product**. Here's what changes:

- **Schemas are more complex**: `ContentOutline` has nested models (a list of `OutlineSection` objects, each with its own fields). This is how real data looks.
- **Instructions are longer** — production agents need precise, detailed instructions.
- **This is the actual code** — the agents below are identical to what runs when you type `python output/cli.py create "topic"`.

Don't worry if the schema code looks dense at first. Focus on **the flow**: what goes in, what comes out, how agents connect. The details will click with practice.

```python
from dotenv import load_dotenv
load_dotenv()

from agno.agent import Agent
from agno.models.anthropic import Claude
from agno.tools.duckduckgo import DuckDuckGoTools
from pydantic import BaseModel, Field
```

## Schema: ContentOutline

Before building agents, we need to define the **output schema** — the data structure that the Outline Agent will return.

This is the **exact same schema** used in the product (`schemas.py`). When you pass `output_schema=ContentOutline` to an agent, Agno asks the LLM for JSON matching this structure and parses the reply into a `ContentOutline` object.

The schema has 2 classes:
- `OutlineSection` — One section in the article (corresponds to an H2)
- `ContentOutline` — The full outline: title, meta description, keywords, sections

```python
class OutlineSection(BaseModel):
    heading: str = Field(description="Section heading (H2)")
    subheadings: list[str] = Field(default_factory=list, description="Sub-section headings (H3)")
    key_points: list[str] = Field(description="Bullet points to cover")
    seo_keywords: list[str] = Field(default_factory=list, description="Target keywords for this section")


class ContentOutline(BaseModel):
    title: str = Field(description="SEO-optimized article title")
    meta_description: str = Field(description="Meta description, max 160 chars")
    target_keywords: list[str] = Field(description="Primary SEO keywords")
    sections: list[OutlineSection] = Field(description="Ordered list of content sections")
    tone: str = Field(default="informative", description="Writing tone")
```
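To make the nesting concrete, here is a minimal sketch of what happens when JSON in this shape is validated against the schema. The two classes are repeated so the block runs on its own, and `sample_json` is a hand-written, hypothetical payload (not real agent output). It assumes Pydantic v2, where `model_validate_json` is available.

```python
from pydantic import BaseModel, Field, ValidationError


class OutlineSection(BaseModel):
    heading: str = Field(description="Section heading (H2)")
    subheadings: list[str] = Field(default_factory=list, description="Sub-section headings (H3)")
    key_points: list[str] = Field(description="Bullet points to cover")
    seo_keywords: list[str] = Field(default_factory=list, description="Target keywords for this section")


class ContentOutline(BaseModel):
    title: str = Field(description="SEO-optimized article title")
    meta_description: str = Field(description="Meta description, max 160 chars")
    target_keywords: list[str] = Field(description="Primary SEO keywords")
    sections: list[OutlineSection] = Field(description="Ordered list of content sections")
    tone: str = Field(default="informative", description="Writing tone")


# Hypothetical payload written by hand for illustration only.
sample_json = """
{
  "title": "On-Page SEO Basics",
  "meta_description": "A short guide to on-page SEO.",
  "target_keywords": ["on-page seo"],
  "sections": [
    {"heading": "What Is On-Page SEO?", "key_points": ["Definition", "Why it matters"]}
  ]
}
"""

outline = ContentOutline.model_validate_json(sample_json)
first = outline.sections[0]

print(type(first).__name__)  # each nested dict becomes an OutlineSection object
print(first.subheadings)     # omitted optional fields fall back to their defaults
print(outline.tone)          # same for the top-level default

# Required fields are enforced: a payload with only a title is rejected.
try:
    ContentOutline.model_validate_json('{"title": "Missing everything else"}')
except ValidationError as exc:
    missing = sorted(err["loc"][0] for err in exc.errors())
    print("Missing fields:", missing)
```

Note that the `description` strings are hints for the model and are not checked by Pydantic, so "max 160 chars" is not enforced by the schema as written.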

## Research Agent

This is the first agent in the pipeline. Its job:

- **Search the web** using DuckDuckGo to gather information about the topic
- **Analyze keywords** — find primary and secondary keywords
- **Check competitors** — what's ranking highly, what gaps exist
- **Return research notes** as plain text (no JSON needed yet)

This agent uses `DuckDuckGoTools()` — a built-in Agno toolkit that lets the agent automatically search the web when it needs information.

```python
research_agent = Agent(
    name="Research Agent",
    model=Claude(id="claude-sonnet-4-5-20250929"),
    tools=[DuckDuckGoTools()],
    instructions=[
        "You are an expert SEO researcher.",
        "Research the given topic using web search.",
        "Identify primary and secondary keywords, analyze what top-ranking content covers, "
        "and find content gaps.",
        "Return your findings as clear, organized research notes.",
    ],
)
```

## Outline Agent

The second agent takes the research notes from the Research Agent and turns them into a structured outline.

Key points:
- Uses `output_schema=ContentOutline` — the agent **must** return JSON in the exact format
- **No tools needed** — the Outline Agent only processes text, no web search or API calls
- Outline includes 5-8 sections with H2, H3, key points, and keywords per section

> **Recall**: `output_schema` is how Agno forces the LLM to return structured data. Instead of free-form text, you get a Pydantic object with `.title`, `.sections`, etc.

```python
outline_agent = Agent(
    name="Outline Agent",
    model=Claude(id="claude-sonnet-4-5-20250929"),
    output_schema=ContentOutline,
    instructions=[
        "You are an expert content strategist.",
        "Given research notes about a topic, create a structured content outline.",
        "Include 5-8 sections with clear H2 headings, optional H3 subheadings, "
        "key points, and relevant SEO keywords per section.",
        "The title should be SEO-optimized and compelling.",
        "The meta description must be under 160 characters.",
    ],
)
```

## Test Run: Research --> Outline

Now we'll run both agents in sequence:

1. Research Agent searches the web for the topic
2. Outline Agent receives the research notes and creates an outline

This is the **simplest pipeline** — the output of one agent is the input for the next. In the real product, we add database tracking and error handling, but the core logic is identical.

> **Cost:** ~$0.10-0.20 for the two agent runs (the Research Agent may make more than one Sonnet call as it uses the search tool). Takes 30-60 seconds.

```python
topic = "How to optimize on-page SEO for your website"

print("Step 1: Researching...")
research = research_agent.run(f"Research this topic for an SEO article: {topic}")
print(f"Research done! ({len(research.content)} chars)\n")

print("Step 2: Creating outline...")
outline_response = outline_agent.run(
    f"Create a structured content outline from these research notes:\n\n{research.content}"
)
outline = outline_response.content

print(f"Title: {outline.title}")
print(f"Meta: {outline.meta_description}")
print(f"Keywords: {', '.join(outline.target_keywords)}")
print("\nSections:")
for i, section in enumerate(outline.sections, 1):
    print(f"  {i}. {section.heading}")
    for subheading in section.subheadings:
        print(f"     - {subheading}")
```
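The chaining pattern above can also be wrapped in a small function. The following is only a sketch, not the product's pipeline code: `run_pipeline`, `fake_research`, and `fake_outline` are hypothetical names, and the stand-in steps replace the real agents so the block runs without API calls. It adds one simple form of error handling, stopping early when the research step returns nothing usable.

```python
from typing import Callable


def run_pipeline(
    topic: str,
    research_step: Callable[[str], str],
    outline_step: Callable[[str], object],
) -> object:
    """Chain two steps: the research output becomes the outline input."""
    notes = research_step(f"Research this topic for an SEO article: {topic}")
    if not isinstance(notes, str) or not notes.strip():
        # Fail early instead of paying for an outline built on empty notes.
        raise ValueError("Research step returned no usable notes")
    return outline_step(
        f"Create a structured content outline from these research notes:\n\n{notes}"
    )


# Stand-in steps (hypothetical) that mimic the agents' input/output shapes.
def fake_research(prompt: str) -> str:
    return "Primary keyword: on-page seo. Gap: few guides cover internal linking."


def fake_outline(prompt: str) -> dict:
    # Records whether the research notes actually reached this step.
    return {"title": "On-Page SEO Guide", "notes_seen": "internal linking" in prompt}


result = run_pipeline("on-page SEO", fake_research, fake_outline)
print(result)
```

With the agents from this notebook, the two steps could be passed as `lambda p: research_agent.run(p).content` and `lambda p: outline_agent.run(p).content`.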

## Exercise

Change the `topic` variable to a topic relevant to your work, then re-run the test run cell above.

After the outline is generated, write code below to:
1. Print the total number of sections
2. Print the key points for the **first** section only
3. Print the meta description and check: is it under 160 characters? (use `len()`)

This is how you'd inspect and validate agent output in a real workflow.

```python
# Exercise: Write your code here
```
