A2A Research — 5-Agent Web Research & Verification

A research-and-verification pipeline coordinated by an in-process A2A registry and client. Five agents — Planner, Searcher, Reader, FactChecker, Synthesizer — run on a mix of runtimes (PocketFlow planner, smolagents search/read, LangGraph fact-check loop, Pydantic AI synthesis). The system uses live web search and page extraction instead of a local RAG corpus: it decomposes queries into claims, gathers evidence from the public web, verifies claims, and renders a structured markdown report.

Architecture

                    ┌── Searcher (web search)
Planner ──► FactChecker ──┤── Reader (fetch + extract) ──► … loop … ──► Synthesizer
                    └──────────────────────────────────────────────────┘

Agent	Runtime	Role	Output
Planner	PocketFlow	Decomposes the user query into claims and seed search queries	`claims`, `seed_queries`
Searcher	smolagents	Parallel Tavily + DuckDuckGo search	`hits`, `errors`
Reader	smolagents	Fetches URLs and extracts main text (trafilatura)	`pages`
FactChecker	LangGraph	Orchestrates search/read/LLM verify loop until evidence converges or search exhausts	`verified_claims`, `sources`
Synthesizer	Pydantic AI	Structured report from verified claims + citations	`ReportOutput` → markdown

Orchestration: workflow/coordinator.py — run_research_async / run_research_sync drive the pipeline via A2AClient and AgentRegistry.
A2A: a2a_research.a2a — role-scoped AgentExecutor registration, agent cards, in-process task store.
Evidence: a2a_research.tools — web_search, fetch_and_extract.
UI: Mesop web app (src/a2a_research/ui/app.py).

Quick Start

# 1. Install dependencies (use make dev to also setup pre-commit hooks)
make install
# Or: uv sync --all-groups

# 2. Configure credentials
# macOS/Linux: cp .env.example .env
# Windows PowerShell: Copy-Item .env.example .env
# Edit .env — set LLM_API_KEY, optional TAVILY_API_KEY for better search

# 3. Start the Mesop UI
make mesop
# Or: uv run mesop src/a2a_research/ui/app.py
# Opens at http://localhost:32123

# Run tests (no API key required for unit tests)
make test
# Or: uv run pytest

Configuration (`.env`)

All settings are environment variables. The LLM stack is provider-agnostic (same LLM_* keys drive Planner, FactChecker, Searcher/Reader hosts, and the Pydantic AI Synthesizer).

LLM Provider

Variable	Description	Default
`LLM_PROVIDER`	Vendor: `openrouter`, `openai`, `anthropic`, `google`, `ollama`	`openrouter`
`LLM_MODEL`	Model name (provider-specific)	`openrouter/elephant-alpha`
`LLM_BASE_URL`	OpenAI-compatible base URL (OpenRouter/Ollama); optional for native Anthropic/Google	see `.env.example`
`LLM_API_KEY`	API key for your chosen provider	(required for cloud)

Web search & research loop

Variable	Description	Default
`TAVILY_API_KEY`	Tavily API key; blank = DuckDuckGo-only search	(empty)
`SEARCH_MAX_RESULTS`	Per-provider hit cap (Tavily + DDG merged)	`5`
`RESEARCH_MAX_ROUNDS`	Max FactChecker loop rounds	`3`
`WORKFLOW_TIMEOUT`	Coordinator timeout (seconds)	`180`

Application

Variable	Description	Default
`LOG_LEVEL`	`DEBUG`, `INFO`, `WARNING`, `ERROR`	`INFO`
`MESOP_PORT`	Mesop web server port	`32123`

Provider Setup Examples

OpenRouter (default):

LLM_PROVIDER=openrouter
LLM_MODEL=openrouter/elephant-alpha
LLM_API_KEY=sk-or-...

OpenAI:

LLM_PROVIDER=openai
LLM_MODEL=gpt-4o-mini
LLM_API_KEY=sk-...

Ollama (local, no API key):

LLM_PROVIDER=ollama
LLM_MODEL=llama3.2
LLM_BASE_URL=http://localhost:11434/v1
EMBEDDING_PROVIDER=ollama
EMBEDDING_BASE_URL=http://localhost:11434
EMBEDDING_API_KEY=

Anthropic:

LLM_PROVIDER=anthropic
LLM_MODEL=claude-sonnet-4-20250514
LLM_API_KEY=sk-ant-...

Data Flow

1. Research pipeline (web evidence)

User query
  │
  ▼
Planner ──► LLM ──► claims + seed_queries
  │
  ▼
FactChecker (LangGraph loop)
  │     Searcher ──► Tavily + DDG ──► URLs
  │     Reader ──► fetch + trafilatura ──► page text
  │     Verify ──► LLM ──► verdicts / follow-up queries
  │     (repeat until converged, max rounds, or search exhausted)
  │
  ▼
Synthesizer ──► LLM (structured) ──► ReportOutput → markdown

Planner, FactChecker verification, and Synthesizer call get_llm() in providers.py. Searcher/Reader use smolagents with OpenAI-compatible models from the same LLM_* settings.

2. Workflow entrypoints

Public API:

from a2a_research.workflow import run_research_sync, run_research_async

Implementation lives in src/a2a_research/workflow/coordinator.py (linear coordinator over A2AClient). Older PocketFlow builder/adapter modules may still exist for experiments; the Mesop UI and tests target run_research_async.

3. UI

The Mesop app exposes five sections:

Query input — textarea at the bottom; submitting triggers the full pipeline
Agent timeline — per-role card for Planner, Searcher, Reader, FactChecker, Synthesizer (PENDING → RUNNING → COMPLETED/FAILED)
Verified claims — verdict badges (✅ SUPPORTED / ❌ REFUTED / ⚠️ INSUFFICIENT_EVIDENCE / …), confidence, sources, snippets
Sources panel — deduplicated URLs from the FactChecker
Final report — markdown from the Synthesizer (session.final_report)

ResearchSession is the source of truth for timeline, errors, and outputs.

Project Structure

src/a2a_research/
├── agents/          # Planner (pocketflow), Searcher/Reader (smolagents), FactChecker (langgraph), Synthesizer (pydantic_ai)
├── a2a/             # AgentRegistry, A2AClient, agent cards, task helpers
├── workflow/        # run_research_* coordinator
├── tools/           # web_search, fetch_and_extract
├── models/          # ResearchSession, Claim, AgentRole, ReportOutput, WebSource, …
├── providers.py     # get_llm() — shared LLM vendor abstraction
├── ui/              # Mesop web app (app.py, components, …)
└── settings.py      # pydantic-settings (`LLM_*`, Tavily, timeouts, …)

data/corpus/         # Optional sample markdown (not used by the default web pipeline)
tests/               # Pytest suite (no API key required for unit tests)

Example Chat Agent Scaffolds (other frameworks)

src/a2a_research/agents/ also ships three reference basic chat agents built on alternative frameworks — not wired into the research pipeline by default, but ready to be registered as A2A handlers when you want to experiment with a different runtime.

Folder	Framework	Canonical primitive	Highlight
`agents/langgraph/`	LangGraph	`StateGraph` + `MessagesState` + `InMemorySaver`	Multi-turn memory via `thread_id=session.id`
`agents/pydantic_ai/`	Pydantic AI	`Agent(model, instructions=…)` with `OpenAIChatModel`	Typed, `deps_type` for request-scoped context
`agents/smolagents/`	smolagents	`ToolCallingAgent(tools=[], …)` with `OpenAIServerModel`	No Python execution; see folder README for the `CodeAgent` security note

Each folder exposes the same surface:

from a2a_research.agents.langgraph import chat_invoke  # or pydantic_ai / smolagents
from a2a_research.models import ResearchSession

session = ResearchSession(query="What is RAG?")
result = chat_invoke(session)
print(result.raw_content)

Run the standalone CLI demo per framework:

uv run python -m a2a_research.agents.langgraph   "hi"
uv run python -m a2a_research.agents.pydantic_ai "hi"
uv run python -m a2a_research.agents.smolagents  "hi"

Replace a default agent by registering an AgentExecutor factory (a zero-argument callable that returns a new executor instance from a2a.server.agent_execution):

from a2a_research.a2a import register_executor_factory
from a2a_research.agents.pocketflow.planner.main import PlannerExecutor
from a2a_research.models import AgentRole

register_executor_factory(AgentRole.PLANNER, PlannerExecutor)

Use any class or zero-argument callable that produces an executor; the built-in defaults are registered from a2a_research.a2a.registry._register_defaults.

See each folder's README.md for framework-specific details (streaming hooks, multi-turn patterns, DI, and security notes).

Demo Flow

# Start UI (ensure .env has LLM_API_KEY; optional TAVILY_API_KEY)
make mesop
# Open http://localhost:32123

Programmatic Use

from a2a_research.workflow import run_research_sync

session = run_research_sync("What is retrieval-augmented generation?")

print(session.final_report)

for claim in session.claims:
    print(f"{claim.verdict.value} ({claim.confidence:.0%}): {claim.text}")

print(session.error)

Development

The project includes a self-documenting Makefile. Run make or make help to see all available commands:

$ make
  install         Install package with uv
  dev             Full dev setup: install + activate pre-commit hooks
  test            Run pytest suite with coverage
  watch           Run pytest in watch mode (re-runs on file changes)
  lint            Run ruff linter
  format          Format code with ruff
  format-check    Check formatting without modifying files
  typecheck       Run mypy type checker
  typecheck-ty    Run ty type checker
  check           Run all quality checks (no tests)
  all             Run tests + all quality checks (CI-ready)
  clean           Remove build artifacts and cache directories
  mesop           Start Mesop UI (with MESOP_STATE_SESSION_BACKEND=memory)
  htmlcov         Generate HTML coverage report

Common Workflows

Setup for development:

make dev           # Install + activate pre-commit hooks

During development:

make test          # Run tests
make watch         # Run tests in watch mode (TDD)

Before committing:

make check         # Run lint + typecheck + format-check (fast)
make all           # Run everything including tests (CI-ready)

Run the UI:

make mesop         # Start the Mesop UI dev server

Direct Commands

You can also run tools directly without Make:

uv run ruff check src/ tests/     # lint
uv run ruff format src/ tests/    # format
uv run mypy src/                  # type check (strict py311)
uv run ty check src/              # type check with ty
uv run pytest                     # run test suite

Pre-commit Hooks

Install pre-commit hooks to catch issues before pushing:

pre-commit install

CI/CD

GitHub Actions runs linting, formatting checks, type checking, and tests on every push and pull request to main. See .github/workflows/ci.yml for details.

Name		Name	Last commit message	Last commit date
Latest commit History 45 Commits
.github		.github
src/a2a_research		src/a2a_research
static		static
tests		tests
.editorconfig		.editorconfig
.env.example		.env.example
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
AGENTS.md		AGENTS.md
CONTRIBUTING.md		CONTRIBUTING.md
Makefile		Makefile
README.md		README.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock
verify_homepage.png		verify_homepage.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

A2A Research — 5-Agent Web Research & Verification

Architecture

Quick Start

Configuration (`.env`)

LLM Provider

Web search & research loop

Application

Provider Setup Examples

Data Flow

1. Research pipeline (web evidence)

2. Workflow entrypoints

3. UI

Project Structure

Example Chat Agent Scaffolds (other frameworks)

Demo Flow

Programmatic Use

Development

Common Workflows

Direct Commands

Pre-commit Hooks

CI/CD

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

A2A Research — 5-Agent Web Research & Verification

Architecture

Quick Start

Configuration (.env)

LLM Provider

Web search & research loop

Application

Provider Setup Examples

Data Flow

1. Research pipeline (web evidence)

2. Workflow entrypoints

3. UI

Project Structure

Example Chat Agent Scaffolds (other frameworks)

Demo Flow

Programmatic Use

Development

Common Workflows

Direct Commands

Pre-commit Hooks

CI/CD

About

Resources

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Configuration (`.env`)

Packages