A council of 9 Claude Managed Agents that autonomously audits a small-business landing page every week, synthesizes the findings, and opens a PR with the proposed redesign.
Built with Opus 4.7 — Anthropic × Cerebral Valley Hackathon submission (deadline 2026-04-26).
A council of 9 Claude Managed Agents audits a landing page once a week, synthesizes findings across SEO, brand-voice, compliance, conversion, copy, and rendered-layout lenses, and hands the operator a reviewable draft PR. The win is cycle time — the analytical loop runs in tens of minutes instead of multi-week agency rounds — and a runtime mechanism (Critic Genealogy) where Opus 4.7 detects an unowned audit gap and registers a brand-new specialist agent against the live API mid-run. A human still reviews the PR before it ships.
weekly trigger
│
▼
Planner ──► SEO / brand / compliance / conversion / copy / monitor / visual-reviewer
│ │
│ └── out-of-scope overlap found
▼
Critic Genealogy (Opus 4.7) ──► registers new specialist ──► reruns on council branch
│
▼
Redesigner ──► draft PR for human review
The 5 pre-registered critics (SEO, brand-voice, functional-health-compliance, conversion, copy) each cover one scope. They commit findings to a shared branch and flag cross-cutting issues in their Out of scope sections.
When >=2 critics flag the same scope as unowned, scripts/critic-genealogy.ts asks Opus 4.7 to (a) author a new critic spec, (b) register it live via POST /v1/agents, and (c) invoke it on the same council branch — all at runtime, no human in the loop.
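The ">=2 critics flag the same scope" trigger can be illustrated with a small counting pass. This is only a sketch under assumptions: the `OutOfScopeFlag` shape and the `findUnownedScopes` helper are hypothetical, and the actual script delegates the gap decision to Opus 4.7 rather than to a fixed rule like this one.

```typescript
// Hypothetical shape: one entry per bullet in a critic's "Out of scope" section.
interface OutOfScopeFlag {
  critic: string; // who raised it, e.g. "seo"
  scope: string; // the scope they think it belongs to, e.g. "accessibility"
}

// Return scopes that no registered critic owns and that at least
// `threshold` distinct critics have flagged.
function findUnownedScopes(
  flags: OutOfScopeFlag[],
  ownedScopes: string[],
  threshold = 2,
): string[] {
  const owned = new Set(ownedScopes);
  const criticsByScope = new Map<string, Set<string>>();

  for (const { critic, scope } of flags) {
    if (owned.has(scope)) continue; // already covered by a registered critic
    const critics = criticsByScope.get(scope) ?? new Set<string>();
    critics.add(critic); // a Set, so repeat flags from one critic count once
    criticsByScope.set(scope, critics);
  }

  return Array.from(criticsByScope.entries())
    .filter(([, critics]) => critics.size >= threshold)
    .map(([scope]) => scope)
    .sort();
}

const unowned = findUnownedScopes(
  [
    { critic: "seo", scope: "accessibility" },
    { critic: "copy", scope: "accessibility" },
    { critic: "brand-voice", scope: "performance" },
  ],
  ["seo", "brand-voice", "compliance", "conversion", "copy"],
);
console.log(unowned); // → ["accessibility"]
```

Counting distinct critics rather than raw bullets keeps one verbose critic from triggering a new specialist on its own.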
Proof it works: fixture dry-run against the 5 committed critics' findings produces a schema-valid accessibility-critic spec with 7 focus bullets, 6 cross-tagged out-of-scope bullets, and a WCAG-tuned severity rubric. Live Opus call, ~$0.03, ~15s:
bun scripts/critic-genealogy.ts --fixtures scripts/__tests__/fixtures/genealogy --dry-run
weekly trigger (cron, manual, or wbs prompt)
│
▼
Claude Code orchestrator session (Opus 4.7)
│
▼
Planner (Opus 4.7)
memory + verdicts → plan.md
│
fans out 7 sessions in parallel ─────────────┐
│
┌───────┬──────────┬──────────┬────────┬────────┬─────────┴─┬───────────────┐
│ SEO │ brand │ FH-compl │ CRO │ copy │ monitor │ visual-review │
│ Sonnet│ Sonnet │ Sonnet │ Sonnet │ Sonnet │ Haiku │ Opus 4.7 │
│ 4.6 │ 4.6 │ 4.6 │ 4.6 │ 4.6 │ 4.5 │ (post) │
└───┬───┴────┬─────┴─────┬────┴────┬───┴────┬───┴─────┬─────┴──────┬────────┘
│ │ │ │ │ │ │
└────────┴─── each critic commits via GitHub MCP ──┴────────────┘
to council/<week-date> branch │
▼
Critic Genealogy (Opus 4.7, runtime)
gap? → new critic spec → POST /v1/agents
→ POST /v1/sessions → commits findings
│
▼
Redesigner (Opus 4.7)
reads all findings on branch
commits proposal.md + decision.json
│
▼
Draft PR opened
human merges → Cloudflare redeploys
Why this composition wins: Managed Agents give each critic a pre-registered scope, MCP tools, and vault credentials. Runtime agent registration via POST /v1/agents lets the orchestrator spawn specialists mid-run, a capability that the gated callable_agents research preview would restrict. Full detail in context/ARCHITECTURE.md.
Honest scope note: the site/ fork of the demo substrate LP and the Claude Code Routine cron wiring are manual in this submission — the composition does not depend on either. The redesigner currently emits proposal.md (the PR body) instead of proposal.diff; diff mode becomes a one-file change once the site source lands.
30-second pitch: Webster is an autonomous landing-page improvement council. Nine Claude Managed Agents plan, audit, monitor, synthesize, and package one weekly redesign proposal; the standout demo is Critic Genealogy, where Opus 4.7 detects an unowned audit gap and registers a new specialist at runtime.
Live-run evidence: the operator surface is the /webster-weekly-council skill (library: SKILL.md index + on-demand phase references + helper scripts); the full single-page runbook lives at prompts/second-wbs-session.md. Registration IDs live in environments/webster-council-env.id and context/*/id.txt. Run artifacts are written under history/<week>/ when the weekly run executes.
Demo arc artifacts: an 11-week simulation council run, week-by-week, browsable as files. Start at demo-output/landing-page/INDEX.md for the narrated walk-through. Each week directory under demo-output/landing-page/w00..w10/ contains desktop/mobile/tablet screenshots, heatmap JSON+SVG, synthetic analytics, and the visual reviewer's markdown verdict. Anthropic Managed Agents memory-store provisioning is captured at assets/memory-stores-screenshots/. The render pipeline that turns these per-week assets into a timelapse video is submission tooling and lives outside the public repo.
Hero code: scripts/critic-genealogy.ts is the runtime specialist-spawn path; scripts/__tests__/critic-genealogy.test.ts and scripts/__tests__/fixtures/genealogy are the fixture proof.
Validate locally: run bun install once, then bun run validate for type-check, zero-warning lint, format, agent schemas, findings format, markdown, and tests.
If you're evaluating this submission and have five minutes:
- Read the 30-second pitch + hero moment above (you're here) — that's the architecture and the novel-mechanic claim in one screen.
- Open `demo-output/landing-page/INDEX.md`: narrated walk through the 11-week LP timelapse. One paragraph per week, links to that week's screenshots + heatmap + visual-reviewer verdict.
- Click into one week's `visual-review.md` (e.g. `w04/visual-review.md` for the largest beat, `w10/visual-review.md` for the terminal polish): that's what the council actually wrote about its own changes.
- Read `scripts/critic-genealogy.ts`, the hero file. Two tools (`report_no_gap` / `report_gap`), Opus 4.7 picks one, then drafts a JSON spec, registers it via `POST /v1/agents`, and invokes it via `POST /v1/sessions`, all at runtime.
- Optional, if a terminal is handy: `bun install && bun scripts/critic-genealogy.ts --fixtures scripts/__tests__/fixtures/genealogy --dry-run`. Live Opus 4.7 call against the committed fixture findings, ~15s wall clock, prints the new critic spec it would have registered.
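The two-tool decision described in the list above might look roughly like this. The tool names `report_no_gap` and `report_gap` come from this README; the descriptions, the input fields (`reason`, `scope`, `rationale`), and the `interpretDecision` helper are assumptions made for illustration, not the contents of `scripts/critic-genealogy.ts`. The `input_schema` and `tool_use` shapes follow the Anthropic Messages API tool-use format.

```typescript
// Tool definition in the Messages API tool-use format.
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required: string[];
  };
}

// Names are from the README; every field below them is hypothetical.
const genealogyTools: ToolDefinition[] = [
  {
    name: "report_no_gap",
    description: "Call when every flagged issue is owned by an existing critic.",
    input_schema: {
      type: "object",
      properties: { reason: { type: "string" } },
      required: ["reason"],
    },
  },
  {
    name: "report_gap",
    description: "Call when a flagged scope has no owning critic.",
    input_schema: {
      type: "object",
      properties: { scope: { type: "string" }, rationale: { type: "string" } },
      required: ["scope", "rationale"],
    },
  },
];

// A tool_use content block as returned in a Messages API response.
interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
  input: Record<string, unknown>;
}

type GapDecision =
  | { gap: false; reason: string }
  | { gap: true; scope: string; rationale: string };

function asString(value: unknown, field: string): string {
  if (typeof value !== "string") throw new Error(`missing string field: ${field}`);
  return value;
}

// Turn the model's single tool call into a typed decision.
function interpretDecision(block: ToolUseBlock): GapDecision {
  switch (block.name) {
    case "report_no_gap":
      return { gap: false, reason: asString(block.input.reason, "reason") };
    case "report_gap":
      return {
        gap: true,
        scope: asString(block.input.scope, "scope"),
        rationale: asString(block.input.rationale, "rationale"),
      };
    default:
      throw new Error(`unexpected tool: ${block.name}`);
  }
}
```

Forcing the model to choose between two named tools gives the orchestrator a structured yes/no it can branch on, rather than free text it would have to parse.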
agents/production/ holds the 9 pre-registered specs; agents/simulation/ holds the 1:1 simulation mirror used for the timelapse run. prompts/second-wbs-session.md is the production weekly orchestrator (locked); skills/webster-weekly-council/SKILL.md is the same flow as a Claude Code skill.
webster/
├── agents/
│ ├── production/ 9 Managed Agent specs that run Nicolette's live council
│ └── simulation/ 9 LP-sim specs (1:1 mirror) that drive the timelapse demo
├── context/ architecture, features, quality gates, per-critic findings dirs
├── environments/ webster-council-env.json (single Anthropic environment)
├── prompts/ first-wbs-session.md (bootstrap), second-wbs-session.md (weekly run runbook)
├── scripts/ validate-agents, validate-findings, critic-genealogy
├── skills/ webster-weekly-council (operator surface for the weekly run),
│ webster-onboarding (first-time setup for a new operator),
│ webster-lp-audit (shared critic discipline),
│ webster-browser-audit (Playwright-headless audit capability)
├── .github/workflows/ CI: type + lint + format + schema + findings + markdown + tests
├── .husky/ pre-commit runs the same gates locally
└── AGENTS.md operator guide for in-repo work
The live council runner is a Claude Code library skill: /webster-weekly-council — slim SKILL.md index, on-demand phase references under references/, and reusable helper scripts under scripts/. The single-page bash-in-markdown runbook at prompts/second-wbs-session.md is the same flow as a scrollable readable page. Both produce identical artifacts. The flow:
- Seeds 10 weeks of mock analytics on first run (monitor needs baselines to diff).
- Prepares a shared `council/YYYY-MM-DD` branch.
- Runs the planner: marshals `history/memory.jsonl`, recent verdicts, and monitor anomalies; writes `history/YYYY-MM-DD/plan.md`.
- Fans out 7 Managed Agent sessions (monitor + 5 critics + visual-reviewer); each commits `context/critics/<scope>/findings.md` via GitHub MCP.
- Validates findings via `bun scripts/validate-findings.ts`.
- Runs the redesigner, which commits `history/YYYY-MM-DD/proposal.md` + `decision.json`.
- Opens a draft PR.
Wall-clock per run is in the tens of minutes; the bulk of that is the parallel critic fan-out, not orchestration overhead.
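The per-run naming scheme in the flow above can be summarised as a function of the run date. This is a sketch only: `weekStamp` and `councilPaths` are hypothetical helpers, the UTC date is an assumption borrowed from the `date -u` usage elsewhere in this README, and only the critic findings paths are modelled.

```typescript
// Format a date as YYYY-MM-DD in UTC (toISOString always reports UTC).
function weekStamp(date: Date): string {
  return date.toISOString().slice(0, 10);
}

interface CouncilPaths {
  branch: string;
  plan: string;
  findings: string[];
  proposal: string;
  decision: string;
}

// Derive the shared branch and artifact paths for one weekly run.
function councilPaths(date: Date, criticScopes: string[]): CouncilPaths {
  const week = weekStamp(date);
  return {
    branch: `council/${week}`,
    plan: `history/${week}/plan.md`,
    findings: criticScopes.map((scope) => `context/critics/${scope}/findings.md`),
    proposal: `history/${week}/proposal.md`,
    decision: `history/${week}/decision.json`,
  };
}

const paths = councilPaths(new Date(Date.UTC(2026, 3, 26)), ["seo", "copy"]);
console.log(paths.branch); // → council/2026-04-26
```

Deriving every path from one date stamp means the branch, the plan, and the redesigner's output always refer to the same run.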
Submission note: all 9 agent specs are registered against the live Anthropic API (IDs in environments/webster-council-env.id + context/*/id.txt), the genealogy hero is live-validated (~$0.03 Opus 4.7 dry-run documented above), and the full orchestration prompt is committed. The end-to-end fan-out that produces history/YYYY-MM-DD/ artifacts is the operator-triggered weekly run — history/ is empty at submission time by design. Loop has been exercised component-by-component.
Strict validation discipline. One command:
bun run validate
Chains: tsc --noEmit → eslint --max-warnings 0 → prettier --check → agent+environment schema validation → findings format validation → markdownlint → bun test. Every gate is blocking. Pre-commit hook enforces the same set. CI enforces the same set on push + PR. See context/QUALITY-GATES.md.
Current state: 29 test files green via bun run validate, 0 lint warnings, 0 type errors, 18 JSON specs valid, 6 findings files valid.
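"Every gate is blocking" amounts to a short-circuiting chain: the first failure stops the run and later gates never execute. The sketch below shows that control flow only. The `Gate` type and `runGates` function are hypothetical, and each `run` callback stands in for spawning the real command (the actual chain lives in the `validate` script, which is not reproduced here).

```typescript
interface Gate {
  name: string;
  run: () => boolean; // stand-in for spawning the real command; true means exit code 0
}

interface GateReport {
  passed: string[];
  failed: string | null; // name of the first failing gate, if any
}

// Run gates in order and stop at the first failure.
function runGates(gates: Gate[]): GateReport {
  const passed: string[] = [];
  for (const gate of gates) {
    if (!gate.run()) {
      return { passed, failed: gate.name };
    }
    passed.push(gate.name);
  }
  return { passed, failed: null };
}

// Gate order mirrors the chain described above; the outcomes are made up.
const report = runGates([
  { name: "types", run: () => true },
  { name: "lint", run: () => false },
  { name: "format", run: () => true },
]);
console.log(report); // → { passed: ["types"], failed: "lint" }
```

Because the same ordered list could back the pre-commit hook and CI, a gate added in one place would not silently be missing from the other.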
- Best Use of Claude Managed Agents: 9 pre-registered production agents (with a 1:1 sim mirror in `agents/simulation/`) + runtime-registered genealogy critics, all invoked via `/v1/sessions` with vault-bound GitHub MCP (no tokens in `user.message`).
- Creative Exploration: runtime critic genealogy. Gap detection → template-cloned spec → live `POST /v1/agents` → immediate invocation. The emergent-capability demo beat.
- `bun >= 1.3.0`
- `jq` (for bash scripts inside the prompts)
- `gh` CLI (authenticated to the target repo)
- `git` with commit-signing configured
- An Anthropic API key stored in macOS keychain under service `anthropic-webster`. First-session will show the exact `security add-generic-password` command if missing.
The wbs @prompts/... commands below assume a shell alias that launches Claude Code into Webster's dispatcher mode (Opus 4.7, 1M context, custom system prompt at .claude/dispatcher.md, custom settings at .claude/dispatcher-settings.json). Add to your shell rc:
alias wbs='cd ~/Projects/webster && claude --dangerously-skip-permissions --model claude-opus-4-7 \
--settings .claude/dispatcher-settings.json \
--system-prompt "$(cat .claude/dispatcher.md)"'
Or run the equivalent claude --settings ... --system-prompt ... directly without aliasing. Either works.
wbs @prompts/first-wbs-session.md
Registers the single environment + 9 production agents against the Anthropic API. Runs an SEO hello-world to prove the council loop end-to-end. Artifacts: environments/webster-council-env.id + context/{monitor,redesigner,critics/*}/id.txt.
In Claude Code (primary):
/webster-weekly-council
Or as a single-page prompt (fallback):
wbs @prompts/second-wbs-session.md
Both run the full planner + fan-out + redesigner + draft PR described above. The skill loads phase references on demand (smaller per-turn context budget); the prompt is one readable file.
bun scripts/critic-genealogy.ts --branch council/$(date -u +%Y-%m-%d)
Reads the week's findings, asks Opus 4.7 if any scope is unowned, and spawns + registers + invokes a new critic if yes. Use --fixtures scripts/__tests__/fixtures/genealogy --dry-run to see the flow without making API writes.
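The two invocation modes above (a live council branch, or committed fixtures with `--dry-run`) could be modelled as follows. The three flag names come from this README; the `GenealogyOptions` type, the `parseGenealogyArgs` function, and the precedence of `--fixtures` over `--branch` are assumptions for illustration, since the script's real argument handling is not shown here.

```typescript
interface GenealogyOptions {
  source: { kind: "branch"; name: string } | { kind: "fixtures"; dir: string };
  dryRun: boolean; // when true, skip API writes and only print the spec
}

// Hand-rolled parser for the three documented flags.
function parseGenealogyArgs(argv: string[]): GenealogyOptions {
  let branch: string | undefined;
  let fixtures: string | undefined;
  let dryRun = false;

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--dry-run") {
      dryRun = true;
    } else if (arg === "--branch" || arg === "--fixtures") {
      const value = argv[++i]; // the flag's value is the next token
      if (value === undefined) throw new Error(`${arg} needs a value`);
      if (arg === "--branch") branch = value;
      else fixtures = value;
    } else {
      throw new Error(`unknown argument: ${arg}`);
    }
  }

  // Assumption: fixtures take precedence when both are given.
  if (fixtures !== undefined) return { source: { kind: "fixtures", dir: fixtures }, dryRun };
  if (branch !== undefined) return { source: { kind: "branch", name: branch }, dryRun };
  throw new Error("pass --branch or --fixtures");
}

const opts = parseGenealogyArgs([
  "--fixtures",
  "scripts/__tests__/fixtures/genealogy",
  "--dry-run",
]);
console.log(opts.source.kind, opts.dryRun); // → fixtures true
```

Modelling the source as a tagged union makes it hard to reach the registration step without knowing whether the findings came from a live branch or from fixtures.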
Every layer uses Opus 4.7 as author:
| Layer | Opus 4.7 role |
|---|---|
| 9 agent specs (agents/production/*.json) | Drafted during bootstrap session, validated against live API |
| Bootstrap + weekly prompts | Opus-authored during dispatcher sessions; in git history |
| Critic Genealogy script | Opus-authored; see dcf5726 + e474301 |
| Redesigner synthesis | Opus 4.7 at runtime — its decision.json outputs live in history/<date>/ |
| Runtime critic spawning | Opus 4.7 selects the gap AND authors the new spec via tool_use |
Repo is entirely MIT. No Anthropic or third-party proprietary code.
MIT. See LICENSE.