A multi-agent AI playground built with Next.js 16 and the Vercel AI SDK. It ships nine multi-agent coordination patterns (v1–v9) over a shared event/message layer — each exposed as its own streaming API route and selectable from the chat UI. Bring your own API key for OpenAI, Anthropic, Mistral, or Fireworks and watch the agents reason, call tools, and converge live.
| Landing | Chat |
|---|---|
![]() |
![]() |
| Settings — bring your own key | Architecture pages |
|---|---|
![]() |
![]() |
The repo demonstrates nine complementary ways to coordinate LLM agents. Every mode streams live (agent reasoning, tool calls, web searches), surfaces an estimated cost per step and per run, and persists to a localStorage chat history.
| Mode | Pattern | In one line | Best when | Trade-off |
|---|---|---|---|---|
| v1 | Orchestrated | Coordinator plans and delegates to research / write / edit specialists, then synthesizes. | Linear content pipelines with clear steps. | The coordinator is a bottleneck; no parallelism. |
| v2 | Choreographed | Backend / frontend / design peers message each other via a bus until all complete. | Cross-functional tasks where peers negotiate a shared artifact. | Negotiation can loop; convergence isn't guaranteed. |
| v3 | Hierarchical | A lead spawns sub-agents at runtime (depth-capped), which run in parallel and roll results up. | Open-ended tasks that break into nested subtasks. | Emergent tree shape is less predictable; costs more tokens. |
| v4 | Evaluator–Optimizer | Generate → critique → revise until a quality bar is met. | Iteratively improving a single artifact to a bar. | Cost grows per round; a strict critic can burn the budget. |
| v5 | Debate | Opposing sides argue; a judge decides and synthesizes. | Decisions where the strongest case for each side matters. | Adds argument rounds; verdict quality depends on the judge. |
| v6 | Blackboard | Agents share one workspace; a controller picks who acts next. | Answers that assemble from many partial contributions. | Controller selection can loop; coordination is slower. |
| v7 | Market | Agents bid on posted tasks; the best bid wins. | Heterogeneous work where the best agent isn't obvious. | The bid round is extra LLM calls; worth it for larger pools. |
| v8 | Self-Consistency | Sample N attempts in parallel; a judge picks/merges the best. | Questions where agreement across attempts signals quality. | N samples cost N× tokens for the sampling step. |
| v9 | Swarm | Identical agents build on a shared scratchpad over rounds. | Open-ended ideation that benefits from many cheap passes. | No structure → redundancy/drift; round-capped. |
Each pattern has a dedicated detail page at /architectures/[mode] with a
diagram, the trade-offs, and source references. See also the deep dive in
docs/ARCHITECTURE.md.
| Route | What it is |
|---|---|
/ |
Landing page — hero + cards for all nine architectures (server component). |
/chat |
The full chat app — mode + model selector, live timeline/tree, history. |
/architectures/[mode] |
Per-pattern detail page (diagram, trade-offs, references). |
/references |
A reading list of the papers/posts behind each pattern. |
The app proxies every model call through its own /api/* routes — the browser
never talks to a provider directly. You can supply your own key per provider
in Settings; it's stored in your browser (localStorage) and sent per
request, never persisted on the server or logged.
| Provider | Example models | Env var | Web search |
|---|---|---|---|
| OpenAI | gpt-5.5 (default), gpt-5.4, gpt-5.4-mini, o4-mini |
OPENAI_API_KEY |
✅ (hosted) |
| Anthropic | claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5 |
ANTHROPIC_API_KEY |
— |
| Mistral AI | mistral-large-latest, mistral-medium-latest, magistral-medium-latest, mistral-small-latest |
MISTRAL_API_KEY |
— |
| Fireworks AI | deepseek-v3p1, llama-v3p3-70b-instruct, qwen3-235b |
FIREWORKS_API_KEY |
— |
The v1 researcher uses OpenAI's hosted web search. On non-OpenAI providers it
degrades gracefully — the agent proceeds on general knowledge rather than
crashing the run (webSearchAvailable() in lib/provider.ts).
Set PUBLIC_BYO_KEY_ONLY=true (and NEXT_PUBLIC_BYO_KEY_ONLY=true for the UI
banner) and the server ignores its own env keys — every visitor must bring
their own. This is how the app is hosted publicly without the owner's key being
spent. Unset locally, the env key works as usual.
┌──────────────┐
user ─────▶ │ /api/agents │ (SSE stream)
└──────┬───────┘
▼
┌────────────────────────┐
│ AgentOrchestrator │ ◀── reads bus, builds prompts
│ - LLM-driven routing │ per-run agents w/ event hooks
│ - max 15 iterations │
└───────────┬────────────┘
▼
┌────────────┐
│coordinator │ ◀──── always starts here
└─────┬──────┘
┌──────────────┼──────────────┐
▼ ▼ ▼
┌──────────┐ ┌────────┐ ┌────────┐
│researcher│ │ writer │ │ editor │
└──────────┘ └────────┘ └────────┘
┌──────────────────┐
user ─────▶│ /api/agents-v2 │ (SSE stream)
└────────┬─────────┘
▼
┌─────────────────────────┐
│ AgentRunner │
│ - round-robin schedule │
│ - inbox-fed prompts │
│ - all-must-complete │
└───────────┬─────────────┘
┌───────────────┼───────────────┐
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ backend │ ◀──▶│frontend │ ◀──▶│ design │
└─────────┘ └─────────┘ └─────────┘
└── pub/sub via per-conversation bus ──┘
┌──────────────────┐
user ─────▶│ /api/agents-v3 │ (SSE stream)
└────────┬─────────┘
▼
┌──────────────┐
│ Lead │ spawnSubAgent() at runtime
└──────┬───────┘
┌─────────────┴─────────────┐
▼ ▼
┌─────────────┐ ┌─────────────┐
│ sub-agent A │ │ sub-agent B │ (run in parallel)
└──────┬──────┘ └──────┬──────┘
┌───┴───┐ ┌───┴───┐
▼ ▼ ▼ ▼
leaf leaf leaf leaf (depth-capped)
Caps: depth 2, width 4 per node, 15 total nodes. Each parent runs a synthesis pass over its children's deliverables.
┌──────────────────┐
user ─────▶│ /api/agents-v4 │ (SSE stream)
└────────┬─────────┘
▼
┌────────────────────┐
│ Generator │ ◀───────────┐
│ writes a draft │ │ revise with
└─────────┬──────────┘ │ the critic's
▼ │ issues
┌────────────────────┐ │
│ Critic │ ── issues ──┘
│ scores 0–10 │
└─────────┬──────────┘
▼
score ≥ 8 ? ──▶ done (else loop, up to 4 rounds)
Pass bar: score ≥ 8, max 4 rounds (keeps the best draft if never passed).
┌──────────────────┐
user ─────▶│ /api/agents-v5 │ (SSE stream)
└────────┬─────────┘
▼
┌─────────────┐ ┌─────────────┐
│ Affirmative │ ◀──▶ │ Opposing │ 3 rounds of argument
└──────┬──────┘ └──────┬──────┘
└───────────┬─────────┘
▼
┌─────────────┐
│ Judge │ picks a winner + synthesizes
└─────────────┘
Two debaters argue opposing stances for 3 rounds; a judge rules and synthesizes.
┌──────────────────┐
user ─────▶│ /api/agents-v6 │ (SSE stream)
└────────┬─────────┘
▼
┌──────────────────────────┐
│ Controller (each round)│ picks who acts next
└────────────┬─────────────┘
┌─────────────────┼─────────────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ analyst │ │ planner │ │ critic │
└────┬─────┘ └────┬─────┘ └────┬─────┘
└─────────────────┼─────────────────┘
▼
┌──────────────────────────┐
│ shared blackboard │ named sections, read/write
└──────────────────────────┘
A content-driven controller picks one role per round; agents read/write named
sections of a shared board until a solution settles, or 8 rounds.
┌──────────────────┐
user ─────▶│ /api/agents-v7 │ (SSE stream)
└────────┬─────────┘
▼
┌────────────────────┐
│ post tasks to the │
│ auction board │
└─────────┬──────────┘
▼ agents bid (fit 0–1, est. $)
┌────────────┬─────────────┬────────────┐
▼ ▼ ▼ ▼
┌─────────┐ ┌──────────┐ ┌──────────┐ ┌─────────┐
│researcher│ │ engineer │ │ designer │ │ analyst │
└─────────┘ └──────────┘ └──────────┘ └─────────┘
└────────────┴──────┬──────┴────────────┘
▼
award each task to its best bid (≤ 2 tasks per agent)
Tasks are auctioned to a roster of four; the highest-fit bid wins each, greedily, with a per-agent bundle cap of 2.
┌──────────────────┐
user ─────▶│ /api/agents-v8 │ (SSE stream)
└────────┬─────────┘
▼ same prompt, N parallel samples
┌────────────┬─────────────┬────────────┐
▼ ▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│sample 1 │ │sample 2 │ │sample 3 │ │sample 4 │ (run in parallel)
└────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘
└────────────┴──────┬──────┴────────────┘
▼
┌───────────────┐
│ Judge │ selects or merges the best
└───────────────┘
Draws 4 independent samples in parallel; a judge selects the best or merges them into a consensus.
┌──────────────────┐
user ─────▶│ /api/agents-v9 │ (SSE stream)
└────────┬─────────┘
▼ N identical agents act every round
┌────────────┬─────────────┬────────────┐
▼ ▼ ▼ ▼
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│ agent │ │ agent │ │ agent │ │ agent │ (no roles, no controller)
└───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘
└────────────┴──────┬──────┴────────────┘
▼
┌──────────────────────────┐
│ shared scratchpad │ each leaves a trace; next round
└──────────────────────────┘ builds on it (3 rounds)
4 identical agents build on a shared scratchpad over 3 rounds — pure stigmergy: no roles, no controller, no direct messaging.
For the full comparison and trade-offs see docs/ARCHITECTURE.md
and each /architectures/[mode] page.
- Nine coordination patterns — switchable from the chat UI, each with its own live visualization (timeline, tree, build-plan board, score ladder, debate, blackboard, auction, sample fan-out, swarm scratchpad).
- Four providers, BYO key — OpenAI / Anthropic / Mistral / Fireworks, entered in Settings, stored client-side, never persisted server-side.
- Live streaming — agent reasoning (
onStepFinish), tool calls, and web searches stream as they happen. - Cost estimates — per-step and per-run token cost (indicative pricing in
lib/models.ts), in the timeline, message meta, and terminal logs. - Structured report deliverable — modes can return a
ReportView(KPIs, charts via recharts, tables) rendered inline in chat. - Chat history — conversations persist in localStorage; runs continue in the background when you switch chats and write back to their originating chat.
- Code preview sidecar — detects code/JSON; a slide-over with syntax highlighting, copy, and a sandboxed live HTML/React preview.
- Debug stream — a drawer of every event with pretty-printed JSON.
- Human-in-the-loop — v1 can pause and ask the user a question mid-run.
- Per-conversation message bus — no cross-request state leakage.
- Hardened — input validation + CSP/security headers; see
SECURITY.md.
- Framework: Next.js 16 (App Router, Turbopack)
- Runtime: React 19
- AI SDK: Vercel AI SDK 5 (Experimental Agent API)
- Providers:
@ai-sdk/openai,@ai-sdk/anthropic,@ai-sdk/mistral,@ai-sdk/fireworks - Charts: recharts · Validation: Zod · Styling: Tailwind CSS 4
- UI: lucide-react icons, custom markdown renderer,
chalk+boxenlogs - Language: TypeScript · Package Manager: pnpm
- Node.js 20.9+ (Next 16 requires it)
- pnpm (this repo uses pnpm —
npm installis not supported) - At least one provider API key (OpenAI, Anthropic, Mistral, or Fireworks)
git clone https://github.com/MarcusElwin/multi-agents-team.git
cd multi-agents-team
pnpm install
cp .env.example .env.localThen edit .env.local and add at least one key:
OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...
# MISTRAL_API_KEY=...
# FIREWORKS_API_KEY=fw_...
# PUBLIC_BYO_KEY_ONLY=true # public deploy: ignore env keys, require BYO
# NEXT_PUBLIC_BYO_KEY_ONLY=true # show the BYO banner in the UIYou can also skip .env.local entirely and enter keys in the in-app Settings
drawer.
pnpm dev # dev server → http://localhost:3000
pnpm build && pnpm start # production buildpnpm test:agents # v1 — orchestrated (coordinator + researcher + writer + editor)
pnpm test:runner # v2 — choreographed (backend + frontend + design)
pnpm test:smoke # run a tiny prompt through every mode (v1–v9); SMOKE_MODEL to overrideA built-in chat app at /chat (the landing page lives at /).
- Model selector — models grouped by provider from
lib/models.ts. The selected model applies to every agent in the run. - Mode selector — pick any of v1–v9, routing to
/api/agents…/api/agents-v9. - Settings — enter/clear per-provider API keys (stored in your browser).
- Architecture / Debug — the per-mode diagram drawer and the live event stream.
- Welcome state: mode-aware suggestion chips + a collapsible architecture panel.
- While running: each mode renders its own live view (timeline, tree, board, ladder, debate, blackboard, auction, sample fan-out, or swarm) with a running cost readout.
- Completed messages: a synthesized markdown report (long reports collapse;
code renders as blocks with a "View code" preview), a pattern-specific summary
card, and an optional structured
ReportView(KPIs/charts/tables). - Meta pills: mode, model, iterations/agents, duration, estimated cost.
- Sidebar: past conversations (localStorage), grouped by recency, running dot for in-flight runs.
⚠️ Model access: picking a model id your account isn't enabled for returns a 4xx in the chat. Unknown values fall back to the default.
All nine architectures are Server-Sent Events endpoints
(Content-Type: text/event-stream). Each data: frame is a JSON AgentEvent
(see lib/agent-events.ts); a run ends with workflow_complete (or
workflow_error). Request body: { message, model?, history?, apiKey?, provider? }.
The body is validated and bounded (lib/validate-request.ts) — see
SECURITY.md.
curl -N -X POST http://localhost:3000/api/agents \
-H "Content-Type: application/json" \
-d '{"message": "Write a blog post about AI agents", "model": "gpt-5.5"}'| Route | Mode | maxDuration |
|---|---|---|
/api/agents |
v1 orchestrated | 60s |
/api/agents-v2 |
v2 choreographed | 120s |
/api/agents-v3 |
v3 hierarchical | 300s |
/api/agents-v4 |
v4 evaluator–optimizer | 180s |
/api/agents-v5 |
v5 debate | 180s |
/api/agents-v6 |
v6 blackboard | 240s |
/api/agents-v7 |
v7 market | 240s |
/api/agents-v8 |
v8 self-consistency | 180s |
/api/agents-v9 |
v9 swarm | 240s |
/api/agents/input |
human-in-the-loop answer delivery | — |
Each run route also has a GET returning a small readiness/status object.
export type ProviderId = 'openai' | 'anthropic' | 'mistral' | 'fireworks';
export const DEFAULT_MODEL: ModelId = 'gpt-5.5';
export const MODEL_OPTIONS: ModelOption[]; // { value, label, provider, description }
export const PROVIDERS: Record<ProviderId, { envVar; keyPrefix; createClient(apiKey) }>;
export function providerForModel(model): ProviderId;
export function resolveModel(input?): ModelId; // safe, with fallback
export function estimateCost(model, usage): number; // USDPer-request credentials are resolved and isolated via AsyncLocalStorage
(lib/provider.ts): withProvider({ providerId, apiKey }, fn) wraps a run, and
agents call provider() instead of importing a provider directly — so adding a
provider is one entry in PROVIDERS, not a signature change across 16 files.
| v1 | v2 | v3 | |
|---|---|---|---|
| Iteration / node cap | maxIterations = 15 |
maxIterations = 10 |
depth 2 · width 4 · 15 nodes |
API maxDuration |
60s | 120s | 300s |
v4–v9 are round/sample-capped; see each runner in lib/ and the route
maxDuration table above.
multi-agents-team/
├── app/
│ ├── api/agents{,-v2..-v9}/route.ts # nine SSE endpoints
│ ├── api/agents/input/route.ts # human-in-the-loop answer delivery
│ ├── architectures/[mode]/page.tsx # per-pattern detail page
│ ├── chat/page.tsx # the full chat app
│ ├── references/page.tsx # reading list
│ ├── components/ # timeline, tree, drawers, ReportView, …
│ ├── hooks/ # useConversations, useSettings
│ └── page.tsx # landing page
├── lib/
│ ├── agents{,-v2,-v3..}/ # per-mode agent factories
│ ├── tools/ # web-search, report, …
│ ├── *-runner.ts / orchestrator.ts # one runner per mode
│ ├── models.ts # catalog, PROVIDERS, pricing, cost
│ ├── provider.ts # AsyncLocalStorage provider injection
│ ├── modes.ts # v1–v9 mode specs (+ references)
│ ├── validate-request.ts # request-body validation/bounds
│ ├── conversation.ts / message-bus.ts # per-run state
│ └── agent-events.ts # AgentEvent union + RunSummary
├── scripts/ # test-agents, test-runner, smoke-test
├── docs/ # ARCHITECTURE.md, images/
├── SECURITY.md
└── README.md
One entry in PROVIDERS (lib/models.ts) with its createClient, plus its
models in MODEL_OPTIONS and rates in MODEL_PRICING. The settings drawer and
model selector are data-driven, so the new provider appears automatically.
- Agents in
lib/agents-vN/(factories taking(model, hooks)). - A runner in
lib/that drives the loop and emitsAgentEvents. - An SSE route at
app/api/agents-vN/route.ts. - New event variants in
lib/agent-events.ts(widen themodeunions). - A
MODES.vNspec inlib/modes.ts(label, diagram, suggestions, references). - UI wiring in
app/chat/page.tsx(+ a render component if the output is new).
Input validation, BYO-key handling, CSP/security headers, and the threat model
are documented in SECURITY.md. Found a vulnerability? Email
marcus@elwin.com — please don't open a public issue.
- "API key required" — add a key in Settings, or set the provider's env var
in
.env.local; restart the dev server after editing env vars. - 4xx on a request — the chosen model isn't enabled on your account; pick another.
- Node version error — Next 16 needs Node ≥ 20.9 (
nvm use 20). npm installerrors — use pnpm, not npm.- Cost shows $0 — the AI SDK didn't return token
usagefor that call; estimates depend on it and are indicative, not billing-accurate.
pnpm i -g vercel
vercel link # link to a Vercel project
vercel # deploy a preview; `vercel --prod` for productionThen in the Vercel dashboard → Settings → Environment Variables, add the
provider keys you want as server fallbacks, and (for a public deploy)
PUBLIC_BYO_KEY_ONLY=true + NEXT_PUBLIC_BYO_KEY_ONLY=true.
MIT License
- Built with the Vercel AI SDK
- Patterns informed by Anthropic's Building effective agents
and the sources on each
/architectures/[mode]page.
Note: This project uses the Vercel AI SDK's experimental Agent API. Features may change as the SDK evolves.



