Agent-orchestration playbook for v0.20.0 (parallel coding agents) #899

@kovtcharov

Description

Goal

Document the playbook for coordinating parallel coding agents on the v0.20.0 consumer launch — so the team can replicate the pattern reliably across releases instead of reinventing coordination per cycle.

Why this matters

v0.20.0 has 11 consumer-critical issues + PR #606 + cross-cutting v0.18.2/v0.19.0 dependencies. With coding agents executing the work, coordination, not typing speed, becomes the dominant cost. Without explicit orchestration, parallel agents create merge conflicts, divergent patterns, and incoherent UX. With it, GAIA can ship a 12-item consumer launch in 4 weeks that humans alone couldn't match in 12.

Scope

Write docs/playbooks/agent-orchestration.mdx covering:

1. The orchestrator pattern

  • One human / one orchestrator agent breaks work into pieces
  • Parallel implementer agents own one issue each
  • Integrator agent (or human) combines + validates
  • Reference implementation: how Claudia MCP tools (claudia_create_task, claudia_get_task_status) are used to dispatch and monitor
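The dispatch/monitor loop can be sketched as follows. This is a minimal, runnable sketch: claudia_create_task and claudia_get_task_status are the Claudia MCP tools named above, but their parameters and return shapes are assumptions here, so they are stubbed.

```python
# Sketch of the orchestrator loop. claudia_create_task / claudia_get_task_status
# are the Claudia MCP tools referenced above; their parameters and return values
# are assumptions, stubbed so the control flow is runnable.

def claudia_create_task(issue: str, prompt: str) -> str:
    """Stub: dispatch one implementer agent for one issue, return a task id."""
    return f"task-{issue.lstrip('#')}"

def claudia_get_task_status(task_id: str) -> str:
    """Stub: report one of 'running' / 'done' / 'blocked'."""
    return "done"

def orchestrate(issues: list[str]) -> dict[str, str]:
    # One implementer agent per issue, dispatched in parallel.
    tasks = {i: claudia_create_task(i, f"Implement {i} per its spec") for i in issues}
    # The orchestrator polls; the integrator runs only after every task reports.
    return {i: claudia_get_task_status(tid) for i, tid in tasks.items()}
```

The point of the shape is the role split: the orchestrator only dispatches and polls; implementation and integration live in other agents.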

2. When to parallelize vs serialize

  • Parallelize when: file trees are disjoint, no architectural decision is shared, no cross-cutting state
  • Serialize when: same file tree, shared design-system patterns, sequential dependencies (e.g. M1 PWA shell must precede M4 Web Push)
  • Concrete rule: use claudia_list_tasks before spawning parallel work to confirm no overlap
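The concrete rule above can be made mechanical as a pre-flight check. claudia_list_tasks is stubbed here, and the "paths" field (the file trees each running task owns) is an assumed shape, not a documented one.

```python
# Pre-flight overlap check before spawning parallel work. claudia_list_tasks is
# the Claudia MCP tool named above, stubbed here; the "paths" field is an
# assumed shape for the file trees each running task owns.

def claudia_list_tasks() -> list[dict]:
    return [{"id": "task-601", "paths": {"apps/pwa/"}}]  # stub data

def _overlaps(a: str, b: str) -> bool:
    # One path is inside the other's tree when either is a prefix of the other.
    return a.startswith(b) or b.startswith(a)

def safe_to_parallelize(candidate_paths: set[str]) -> bool:
    """Parallelize only when the candidate file tree is disjoint from every running task's."""
    for task in claudia_list_tasks():
        if any(_overlaps(owned, p) for owned in task["paths"] for p in candidate_paths):
            return False  # same tree: serialize instead
    return True
```

A False answer means serialize behind the running task rather than spawning a sibling.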

3. Stop-the-line rules

4. Spec-before-PR rule

5. Review gates

  • Every agent-authored PR runs through code-reviewer agent (.claude/agents/code-reviewer.md) before human review
  • Architecture-touching PRs additionally run architecture-reviewer
  • The Opus reviewer in the claude.yml workflow is the final automated gate
  • Human review focuses on: integration, UX coherence, "does this actually solve the problem"
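The gate ordering above can be pinned down as data. The gate names come from the bullets; how a PR is classified as architecture-touching is out of scope here, so it is a plain boolean input.

```python
# Sketch of the review-gate ordering for an agent-authored PR. Gate names come
# from the list above; "touches architecture" detection is out of scope, so it
# is a plain boolean input.

def gates_for(touches_architecture: bool) -> list[str]:
    gates = [".claude/agents/code-reviewer.md"]   # every agent-authored PR
    if touches_architecture:
        gates.append("architecture-reviewer")     # architecture-touching PRs only
    gates.append("claude.yml Opus reviewer")      # final automated gate
    gates.append("human review")                  # integration / UX / problem fit
    return gates
```

Human review always comes last so it spends attention on integration and UX rather than on issues the earlier gates already catch.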

6. Integration validation

  • Before any release ships, run the full consumer journey end-to-end (A4 for v0.20.0)
  • "Consumer journey" = installable artifact → first-run → 3 starter skills work → privacy verifier passes → Telegram delivery works
  • Block release if any leg fails
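One way to make "block release if any leg fails" mechanical. The leg names come from the journey definition above; each lambda is a hypothetical placeholder for the real end-to-end check.

```python
# Release gate sketch: run every leg of the consumer journey and block the
# release if any fails. Legs mirror the definition above; the lambdas are
# placeholders for the real checks.

JOURNEY = [
    ("installable artifact", lambda: True),
    ("first-run",            lambda: True),
    ("3 starter skills",     lambda: True),
    ("privacy verifier",     lambda: True),
    ("Telegram delivery",    lambda: True),
]

def validate_release() -> tuple[bool, list[str]]:
    failures = [name for name, check in JOURNEY if not check()]
    return (not failures, failures)  # (ship?, which legs failed)
```

Returning the failed leg names, not just a boolean, gives the orchestrator something concrete to flag when the release is blocked.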

7. Tracking + observability

  • All consumer-critical issues carry consumer-critical label (filterable in GitHub)
  • Each release has a milestone with explicit ship-date confidence
  • Each parallel agent task is visible via claudia_list_tasks to prevent duplicate work
  • Daily standups are: orchestrator reads claudia_list_tasks + open PR list + flags blockers
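The standup read above reduces to one summary over claudia_list_tasks. The tool is stubbed here and its "status" field values are assumptions.

```python
# Standup sketch: the orchestrator reads claudia_list_tasks and flags blockers.
# The tool is stubbed; the "status" values are assumed, not documented.

def claudia_list_tasks() -> list[dict]:
    return [{"id": "task-601", "status": "running"},
            {"id": "task-602", "status": "blocked"}]  # stub data

def standup() -> dict[str, list[str]]:
    tasks = claudia_list_tasks()
    return {
        "in_flight": [t["id"] for t in tasks if t["status"] == "running"],
        "blockers":  [t["id"] for t in tasks if t["status"] == "blocked"],
    }
```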

8. Anti-patterns to avoid

  • Parallel agents on the same file tree without explicit serialization
  • Agents writing tests for code they wrote (calibration risk — separate test-engineer agent or human)
  • Skipping the spec when "the issue body is enough" — it's never enough for consumer-critical
  • Assuming code-reviewer catches design issues — it catches code issues; design needs human or architecture-reviewer

Deliverables

  • docs/playbooks/agent-orchestration.mdx written
  • Linked from CLAUDE.md and AGENTS.md
  • Linked from docs/docs.json navigation under a new "Playbooks" section
  • Postmortem section template — record what worked / didn't after v0.20.0 ships

Acceptance criteria

  • A new contributor can read the playbook + spawn coordinated agent work on a feature without asking the orchestrator
  • The v0.20.0 retrospective updates the playbook with concrete lessons

Dependencies

Metadata

Labels

  • consumer: Blocks consumer adoption — must ship for the v0.20.0 consumer launch window
  • devops: DevOps/infrastructure changes
  • documentation: Documentation changes
  • domain:quality: Tests, CI/CD, security, performance, evals
  • p1: medium priority
  • track:platform: Foundation that both consumer-app and oem-pc tracks consume
