Command Post

The governance layer for multi-agent AI development.

AI agents build at machine speed. Command Post ensures they build the right thing.

Command Post is a specification compliance and lifecycle management system for coordinated AI agent teams. It sits above your execution tools — Claude Code, Codex, Cursor — and solves the problems they don't: agents drifting from requirements, losing accumulated knowledge at context boundaries, and producing work that nobody verifies against the original spec.

Built in TypeScript. Designed for Claude Code agents in tmux sessions. Informed by nuclear command and control principles where execution authority requires verification authority.

The Problem

The AI coding landscape has an execution obsession. Every new tool makes agents faster. Nobody is making them more governed.

The result is predictable. A 2025 study of 567 Claude Code pull requests found that while 83.8% were accepted, 45% of those required human revision before merge — and the most common issues were logic errors, specification mismatches, and architectural violations that automated tests wouldn't catch (Watanabe et al., 2025). CodeRabbit's analysis of 470 real-world PRs found AI-generated code produces 1.7× more defects than human-written code, with logic and correctness issues 75% higher (CodeRabbit, 2025).

These numbers get worse with multi-agent systems. When three agents work in parallel on different features, specification drift compounds. Agent A makes an architectural assumption. Agent B contradicts it. Agent C builds on both. By the time a human reviews the output, the damage is structural.

And there's a second problem nobody talks about: agent mortality. AI agents have finite context windows. When an agent hits capacity, its accumulated knowledge — every architectural decision, every discovered edge case, every piece of reasoning — vanishes. The replacement agent starts fresh, repeating work and making contradictory decisions. In our testing, unmanaged context transitions cause 30–40% repeated work.

Command Post exists because bad requirements amplify at machine speed, and dead agents take institutional knowledge with them.

What Command Post Does Differently

1. Zero-Cost Context Monitoring

Every other multi-agent framework that tracks context usage does it inside the agent, burning the very tokens it's trying to conserve. Command Post takes a fundamentally different approach: an external daemon that reads Claude Code's JSONL transcript files directly from disk.

function getContextUsage(
  transcriptPath: string,
  maxContextTokens: number = 200_000
): ContextUsageReading | null

The daemon polls every 30 seconds, classifies each agent into green (<60%), yellow (60–70%), or red (≥70%) zones, detects usage trends, and triggers the Memory Snapshot Protocol before Claude Code's auto-compaction at ~80% erases the agent's conversation history.

Zero token cost. Zero agent awareness required. The agent just works; the daemon handles mortality.

2. Memory Snapshot Protocol

When an agent approaches context capacity, Command Post executes a 4-step replacement protocol:

1. Daemon → Agent Inbox:  "prepare_handoff"
2. Agent writes snapshot:   decisions, progress, edge cases, file state, next steps
3. Snapshot validated:      quality checklist (17 checks), retry if score < threshold
4. Replacement spawned:     fresh agent with full context inheritance

The critical difference from "just save your work" approaches: Command Post validates snapshot quality. A snapshot that says "working on auth" is worthless. A snapshot that captures which auth approach was chosen, why OAuth was rejected, what edge case was discovered in token refresh — that's continuity.

interface PrdMemorySnapshot {
  agent_id: string;
  task_id: string;
  snapshot_timestamp: string;
  handoff_number: number;
  context_at_snapshot: number;
  state: {
    current_step: string;
    progress_summary: string;
    completion_estimate: string;
  };
  decisions?: Array<{
    decision: string;
    rationale: string;
    impact: string;
  }>;
  gotchas?: string[];
  files_state?: {
    completed: string[];
    in_progress: string[];
    not_started: string[];
  };
  next_steps: string[];
  dependencies_discovered?: string[];
}

If quality is insufficient, the coordinator sends a retry with specific findings about what's missing. After three failed attempts, it force-handoffs with whatever exists — because a partial snapshot beats no snapshot.

The naming convention tells the story: worker-frontend-1 → worker-frontend-1-r1 → worker-frontend-1-r2. Each replacement carries forward everything its predecessors learned.

3. PRD Audit Agents

Most teams verify AI output the same way they verify human output: code review after the fact. Command Post adds a dedicated governance layer — audit agents that continuously check work against the original specification.

Audit agents are read-only. They load the relevant PRD section and the worker's output, compare intent against implementation, and send structured compliance findings to the orchestrator. They don't communicate with workers directly. The orchestrator decides what feedback to relay.

This isn't a linter. A linter checks syntax. An audit agent checks whether the "user authentication" feature actually implements the authentication flow the PRD described, or whether the agent interpreted it as something else entirely.

4. Governance Hierarchy with Full-Capability Agents

Command Post enforces a communication hierarchy modeled on military command structures:

Human Operator
    ↓
Orchestrator (PM/PO)     ← approval gates, task assignment, spawns agents
    ↓           ↓
Workers (N)   Audit Agents
    ↕
  (peers)

Every agent is a full Claude Code instance in its own tmux session with complete tool access. No nested sub-agents sharing a parent's context window, no single points of failure. Each agent owns its own full context window and can be independently monitored, replaced, and lifecycle-managed. The PO can spawn additional workers or audit agents as project needs evolve.

Workers can message peer workers for cross-cutting concerns. Audit agents report only to the orchestrator. The orchestrator is the only role that can message everyone. Human approval gates are non-negotiable for architectural decisions, security choices, and integration points.

The inbox system enforces this at the protocol level — sendMessage() checks role-based permissions before delivery.

Architecture

                ┌───────────────────────┐
                │    HUMAN OPERATOR     │
                │ (Dashboard/CLI/Direct)│
                └───────────┬───────────┘
                            │
                  ┌─────────▼─────────┐
                  │   Orchestrator    │
                  │   (PM/PO, tmux)   │
                  └───┬───────────┬───┘
                      │           │
            ┌─────────▼───┐ ┌────▼──────────┐
            │ Workers (N) │ │ Audit Agents  │
            │ (tmux each) │ │ (tmux each)   │
            └─────────────┘ └───────────────┘

  ┌─────────────────────────────────────────────┐
  │        Context Monitoring Daemon             │
  │  External process — monitors ALL agents      │
  │  Reads JSONL transcripts from disk           │
  │  Zero token cost · Zero agent awareness      │
  └─────────────────────────────────────────────┘

Each agent runs in an independent tmux session. No shared process. No hierarchical dependency chain. If one agent crashes, the others continue. The daemon monitors all of them from outside.

Package Structure

command-post/
├── packages/
│   ├── core/            # Shared types, inbox system, config
│   ├── orchestration/   # Context daemon, snapshots, replacement protocol
│   ├── cli/             # Agent launching, tmux management
│   └── dashboard/       # Next.js UI (roadmap)
└── schemas/             # JSON schemas for validation

Both cli and orchestration depend on core. No circular dependencies. core has zero internal dependencies.

Quick Start

git clone https://github.com/SeanFDZ/command-post.git
cd command-post
pnpm install
pnpm build
pnpm test   # 393 tests across 3 packages

Command Post is a TypeScript library. Import the packages directly:

import { sendMessage, readInbox, writeMessage } from '@command-post/core';
import { ContextMonitorDaemon, ReplacementCoordinator } from '@command-post/orchestration';
import { SessionManager, AgentRunner } from '@command-post/cli';

A standalone CLI (command-post init, command-post launch, command-post daemon start) is on the roadmap.

The Gap This Fills

Category	Examples	What They Solve	What They Don't
Execution engines	Codex, Cursor, Claude Code	Running agents fast	Specification compliance
Orchestration frameworks	CrewAI, LangGraph, AutoGen	Agent coordination	Context lifecycle management
Context management	OpenAI Sessions, Google ADK	In-session memory	Zero-cost external monitoring
Action-level safety	Agent Gate, SentinelGate	What an agent can do	What an agent should build
Command Post	—	Governance + lifecycle + compliance	—

Command Post is not a replacement for any of these tools. It's the layer above them. Your agents still execute in Claude Code. Your actions are still gated by Agent Gate. Command Post governs what gets built and whether it matches the spec.

What's Included

Context Monitoring Daemon — external JSONL transcript parser, zone classification, trend detection
Memory Snapshot Protocol — 4-step replacement flow with quality validation and retry logic
PRD Audit Agents — read-only specification compliance checking with structured findings
Governance Hierarchy — Orchestrator (PO/PM) with authority to spawn agents as needed, human approval gates
Inbox Messaging — file-based inter-agent communication with role-based permissions
Tmux Session Management — every agent runs as a full Claude Code instance with complete tool access, no capability-limited sub-agents

Roadmap

Standalone CLI tool
Next.js dashboard with real-time agent status
Kanban board for task visualization

Known Limitations

Claude Code specific. The context daemon reads Claude Code's JSONL transcript format. Other agent runtimes would need adapters.
File-based messaging. The inbox system uses JSON files on disk. This is intentional (crash recovery, auditability, simplicity) but won't scale to hundreds of concurrent agents without a message broker.
Snapshot quality depends on the agent. The validation catches structural problems (missing fields, empty next_steps) but can't evaluate whether the content of a snapshot is accurate. A well-structured lie passes validation.
Not a security boundary. Command Post governs specification compliance, not adversarial behavior. For action-level security, use Agent Gate or equivalent tooling.
Single-machine architecture. Agents, daemon, and inbox all assume shared filesystem access. Distributed deployment is a future concern.

Related: Agent Gate

Agent Gate is the execution authority layer for individual AI agents — vault-backed rollback, policy enforcement, and action classification before any destructive operation proceeds.

Command Post governs what agents build. Agent Gate governs what agents are allowed to do. They address different layers of the same problem:

Layer	Project	Controls
Specification Authority	Command Post	PRD compliance, audit agents, governance hierarchy
Action Authority	Agent Gate	Vault-backed rollback, directory enforcement, policy classification

The Command & Control Analogy

The name isn't metaphorical. It's architectural.

In nuclear command and control, every action requires verified authorization through a chain of command. Permissive Action Links ensure weapons can't be used without proper codes. Looking Glass airborne command posts ensure continuity of authority even when ground stations are destroyed.

Command Post applies the same principles to AI agent teams:

Permissive Action Links → Agent Gate's policy enforcement (no action without authorization)
Chain of Command → Governance hierarchy with human approval gates
Looking Glass Continuity → Memory Snapshot Protocol (authority survives agent replacement)
Two-Person Rule → Audit agents verifying worker output against specifications

The nuclear C2 system doesn't trust any single actor. Neither does Command Post.

Author

Sean Lavigne — GitHub · LinkedIn

License

Apache 2.0

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
docs		docs
examples		examples
packages		packages
schemas		schemas
.gitignore		.gitignore
.npmrc		.npmrc
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-workspace.yaml		pnpm-workspace.yaml
tsconfig.build.json		tsconfig.build.json
tsconfig.json		tsconfig.json
vitest.config.ts		vitest.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Command Post

The Problem

What Command Post Does Differently

1. Zero-Cost Context Monitoring

2. Memory Snapshot Protocol

3. PRD Audit Agents

4. Governance Hierarchy with Full-Capability Agents

Architecture

Package Structure

Quick Start

The Gap This Fills

What's Included

Roadmap

Known Limitations

Related: Agent Gate

The Command & Control Analogy

Author

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Command Post

The Problem

What Command Post Does Differently

1. Zero-Cost Context Monitoring

2. Memory Snapshot Protocol

3. PRD Audit Agents

4. Governance Hierarchy with Full-Capability Agents

Architecture

Package Structure

Quick Start

The Gap This Fills

What's Included

Roadmap

Known Limitations

Related: Agent Gate

The Command & Control Analogy

Author

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages