
Citadel — Agent Orchestration for Claude Code

License: MIT | Node.js 18+ | Claude Code

An agent orchestration system for Claude Code. Route any task through the right tool at the right scale — from a one-line fix to a multi-day parallel campaign.

13 skills | 3 autonomous agents | 8 lifecycle hooks | campaign persistence | fleet coordination | telemetry

Built from running 198 autonomous agents across 32 fleet sessions on a production codebase. 27 postmortems worth of lessons baked into every hook and skill.

The harness is simple. The knowledge that shaped it isn't.

Quickstart

From install to first /do command in 5 minutes.

Prerequisites

Node.js 18+ and Claude Code.

1. Copy the harness into your project

macOS / Linux

git clone https://github.com/SethGammon/Citadel.git
cd your-project

cp -r ../Citadel/.claude .
cp -r ../Citadel/.planning .
cp -r ../Citadel/scripts .

# If you don't have a CLAUDE.md yet, copy the starter
cp ../Citadel/CLAUDE.md .

Windows (Command Prompt)

git clone https://github.com/SethGammon/Citadel.git
cd your-project

xcopy /E /I ..\Citadel\.claude .\.claude
xcopy /E /I ..\Citadel\.planning .\.planning
xcopy /E /I ..\Citadel\scripts .\scripts
xcopy ..\Citadel\CLAUDE.md .\CLAUDE.md*

Windows (PowerShell)

git clone https://github.com/SethGammon/Citadel.git
cd your-project

Copy-Item -Recurse ..\Citadel\.claude .\.claude
Copy-Item -Recurse ..\Citadel\.planning .\.planning
Copy-Item -Recurse ..\Citadel\scripts .\scripts
Copy-Item ..\Citadel\CLAUDE.md .\CLAUDE.md

Note: If your project already has a .gitignore, append the entries from the harness .gitignore rather than overwriting yours.

Or copy manually — the harness is just files, no build step.

2. Run setup

Open your project in Claude Code (cd your-project && claude), then:

/do setup

This will:

  • Detect your language and framework
  • Configure the typecheck hook for your stack
  • Generate .claude/harness.json with your settings
  • Create the planning directory structure
  • Run a quick demo on your code

3. Start using it

/do review src/main.ts          # Code review
/do generate tests for utils    # Test generation
/do refactor the auth module    # Safe refactoring
/do scaffold a new API module   # Project-aware scaffolding

Or let the router figure it out:

/do fix the login bug
/do what's wrong with the API
/do build a caching layer

4. Create your first custom skill

/create-skill

It'll ask what patterns you keep repeating and generate a skill file that captures your knowledge permanently.

What's Next

  • Add your project's conventions to CLAUDE.md — the more specific, the better
  • Read docs/SKILLS.md to understand how skills work
  • Try /marshal "audit the codebase" for a multi-step investigation
  • Try /archon "build [large feature]" for multi-session campaigns
  • Try /fleet "overhaul all three modules" for parallel execution

The Orchestration Ladder

Four tiers. Use the cheapest one that fits.

  • Skill: Domain Expert
  • Marshal: Session Commander
  • Archon: Autonomous Strategist
  • Fleet: Parallel Coordinator

The /do Router

One command. Say what you want. The system figures out the rest.

/do fix the typo on line 42          → Direct edit (0 tokens)
/do review the auth module           → /review skill
/do build a caching layer            → /marshal (multi-step)
/do finish the API redesign          → /archon (multi-session campaign)
/do overhaul all three services      → /fleet (parallel agents)

Four-tier classification, cheapest first:

  1. Pattern Match (~0 tokens) — Regex catches trivial commands
  2. Active State (~0 tokens) — Resumes in-progress campaigns
  3. Skill Keywords (~0 tokens) — Matches against installed skills
  4. LLM Classifier (~500 tokens) — Structured complexity analysis

The router biases toward under-routing. It's cheaper to re-invoke than to waste 100K tokens on a typo fix.
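
The cheapest-first cascade can be sketched roughly as follows. This is a minimal illustration, not Citadel's actual router: the regex patterns, the keyword map, and the `classify` callback (standing in for the LLM classifier) are all assumptions made for the example.

```javascript
// Hypothetical sketch of cheapest-first routing; not Citadel's actual code.
// The patterns, keyword map, and classify callback are illustrative assumptions.
const TRIVIAL_PATTERNS = [/^fix (the )?typo\b/i, /^rename \w+ to \w+$/i];

const SKILL_KEYWORDS = {
  review: "/review",
  refactor: "/refactor",
  scaffold: "/scaffold",
};

function route(request, { activeCampaign = null, classify = () => "/marshal" } = {}) {
  const text = request.trim();

  // Tier 1: pattern match, no model call.
  if (TRIVIAL_PATTERNS.some((re) => re.test(text))) {
    return { tier: 1, target: "direct-edit" };
  }

  // Tier 2: active state. "continue" resumes an in-progress campaign.
  if (activeCampaign && /^continue\b/i.test(text)) {
    return { tier: 2, target: `resume:${activeCampaign}` };
  }

  // Tier 3: skill keywords, matched on whole words.
  const words = text.toLowerCase().split(/\W+/);
  for (const [keyword, skill] of Object.entries(SKILL_KEYWORDS)) {
    if (words.includes(keyword)) return { tier: 3, target: skill };
  }

  // Tier 4: only now pay for the (stubbed) classifier.
  return { tier: 4, target: classify(text) };
}
```

Each tier returns as soon as it matches, so the expensive classifier runs only for requests that none of the free tiers can place.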

Commands:

| Command | What It Does |
|---------|--------------|
| /do [anything] | Route to the right tool |
| /do status | Show active campaigns, sessions, pending work |
| /do continue | Resume where you left off |
| /do --list | Show all installed skills |
| /do setup | First-run configuration |

Escape hatches: Direct invocation (/marshal, /archon, /fleet, /review) always bypasses the router.


Built-In Skills (6)

| Skill | What It Does | Invoke |
|-------|--------------|--------|
| Code Review | 5-pass structured review: correctness, security, performance, readability, consistency. Every finding cites a specific line. | /review |
| Test Generation | Generates tests that actually run. Detects your test framework, covers happy path + edge cases + error paths. Iterates up to 3x if tests fail. | /test-gen |
| Documentation | Three modes: function-level docstrings, module READMEs, API reference. Matches your existing doc style. | /doc-gen |
| Refactoring | Safe multi-file refactoring. Typechecks before AND after. If tests fail, reverts and reports. Handles import path updates. | /refactor |
| Scaffolding | Project-aware file generation. Reads your existing structure and matches it. Generates wiring, exports, tests. | /scaffold |
| Skill Creator | Creates new skills from your patterns. Asks what you keep repeating, what mistakes happen, produces a complete skill file. | /create-skill |

These are not skeletons. Each produces real, substantive output on any codebase.


Hooks (8 Lifecycle Events)

Automated quality enforcement that runs without you thinking about it.

| Hook | When | What It Does |
|------|------|--------------|
| Per-file typecheck | Every edit | Catches type errors at write-time, not build-time |
| Circuit breaker | Tool failure | After 3 failures: "try a different approach" |
| Quality gate | Session end | Scans for anti-patterns in modified files |
| Intake scanner | Session start | Reports pending work items |
| File protection | Before edit | Blocks edits to protected files |
| Context preservation | Before/after compaction | Saves and restores session state |
| Worktree setup | Agent spawn | Auto-installs deps in parallel agent worktrees |

Language-adaptive: The typecheck hook detects your stack (TypeScript, Python, Go, Rust) and runs the right checker.
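
One plausible way to do stack detection is to look for well-known marker files in the project root. The sketch below assumes that approach. The marker files and checker commands are common conventions chosen for illustration, not values read from Citadel's hook, and it checks at project level rather than per file.

```javascript
// Hypothetical sketch: pick a checker from marker files in the project root.
// Marker files and commands are common conventions, assumed for illustration.
const CHECKERS = [
  { marker: "tsconfig.json", stack: "typescript", command: "npx tsc --noEmit" },
  { marker: "Cargo.toml", stack: "rust", command: "cargo check" },
  { marker: "go.mod", stack: "go", command: "go vet ./..." },
  { marker: "pyproject.toml", stack: "python", command: "mypy ." },
];

// rootFiles: the file names found in the project root.
// Returns the first matching checker entry, or null if no marker is present.
function detectChecker(rootFiles) {
  const present = new Set(rootFiles);
  return CHECKERS.find((c) => present.has(c.marker)) ?? null;
}
```

For example, `detectChecker(["package.json", "tsconfig.json"])` selects the TypeScript entry, while a root with none of the markers yields `null` so the hook can skip checking.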

Configurable: Add custom quality rules in harness.json. See docs/HOOKS.md.
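
The circuit breaker idea from the hook list can be sketched as a small failure counter. This is an illustration only: the README says "after 3 failures" without specifying whether they must be consecutive, so the reset-on-success behavior below is an assumption, as is the `createCircuitBreaker` name.

```javascript
// Hypothetical sketch of a failure-counting circuit breaker.
// Assumption: the counter tracks consecutive failures and resets on success.
function createCircuitBreaker(threshold = 3) {
  let consecutiveFailures = 0;
  return {
    // Record one tool result; returns advice once the threshold is reached.
    record(succeeded) {
      if (succeeded) {
        consecutiveFailures = 0;
        return null;
      }
      consecutiveFailures += 1;
      return consecutiveFailures >= threshold ? "try a different approach" : null;
    },
  };
}
```

The point of the breaker is to interrupt a loop of identical failing attempts, so the advice is returned on the third failure and on every failure after it until something succeeds.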


Campaign Persistence

Work that survives across sessions.

# Campaign: API Auth Overhaul

Status: active
Direction: "Replace basic auth with JWT"

## Phases
1. [complete] Research: audit existing auth
2. [in-progress] Build: JWT middleware
3. [pending] Wire: connect to routes

## Feature Ledger
| Feature | Status | Phase |
|---------|--------|-------|
| JWT middleware | complete | 2 |

## Decision Log
- Chose jose over jsonwebtoken (ESM native, better types)

## Active Context
Building refresh token endpoint. Middleware done.

## Continuation State
Phase: 2, Sub-step: refresh endpoint

Close the session. Come back tomorrow. /do continue picks up exactly where you left off.

See docs/CAMPAIGNS.md and examples/campaign-example.md.
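
Because a campaign is plain markdown, resuming it comes down to reading a few fields back out of the file. The sketch below parses a file shaped like the example above. The `parseCampaign` function and its return shape are hypothetical, and Citadel's own handling may differ.

```javascript
// Hypothetical sketch: read status, phases, and continuation state from a
// campaign file shaped like the example above. Not Citadel's actual parser.
function parseCampaign(markdown) {
  const lines = markdown.split(/\r?\n/);
  const statusLine = lines.find((l) => l.startsWith("Status:"));
  const status = statusLine ? statusLine.slice("Status:".length).trim() : null;

  // Collect the body lines under each "## Heading".
  const sections = {};
  let current = null;
  for (const line of lines) {
    const heading = line.match(/^##\s+(.*)$/);
    if (heading) {
      current = heading[1].trim();
      sections[current] = [];
    } else if (current) {
      sections[current].push(line);
    }
  }

  // Phase lines look like: 2. [in-progress] Build: JWT middleware
  const phases = (sections["Phases"] ?? [])
    .map((l) => l.match(/^\d+\.\s+\[([^\]]+)\]\s+(.*)$/))
    .filter(Boolean)
    .map(([, state, title]) => ({ state, title }));

  const continuation = (sections["Continuation State"] ?? []).join("\n").trim();
  return { status, phases, continuation };
}
```

A resume step could then look for the first phase whose state is `in-progress` and hand the continuation text to the next session as its starting context.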


Fleet Parallelism

Run multiple agents simultaneously with discovery sharing.

Wave 1: Agent A (src/api/) + Agent B (src/ui/)
  ← Compress discoveries: "API uses jose for JWT, 15min expiry"
  ← Merge branches

Wave 2: Agent C (integration) ← starts with Wave 1's knowledge
  ← Builds refresh logic knowing the token expiry

Agents run in isolated git worktrees. Dependencies auto-installed. Discovery briefs (~500 tokens each) relay knowledge between waves.

See docs/FLEET.md.
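
The wave pattern itself is small enough to sketch. In the illustration below, `runAgent` is a stand-in callback (real agents would be separate Claude Code processes in git worktrees), and the `discovery` field on its result is an assumed shape, not Citadel's actual interface.

```javascript
// Hypothetical sketch of wave-based execution with discovery relay.
// runAgent(task, briefs) is a stand-in that resolves to { discovery }.
async function runFleet(waves, runAgent) {
  let briefs = []; // compressed discoveries carried into the next wave
  for (const wave of waves) {
    // Agents within a wave run concurrently and see the same prior briefs.
    const results = await Promise.all(wave.map((task) => runAgent(task, briefs)));
    // Later waves start with everything earlier waves learned.
    briefs = briefs.concat(results.map((r) => r.discovery).filter(Boolean));
  }
  return briefs;
}
```

Agents in the same wave never see each other's discoveries, only those from completed waves. That is why work depending on another agent's findings, like the integration step above, belongs in a later wave.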


Writing Your Own Skills

The harness ships with 6 skills. You'll want more.

/create-skill

This interviews you about patterns you keep repeating and generates a complete skill file. Every generated skill follows the same standard format, so your custom skills are structured the same way as the built-in ones.

Or write one manually — it's just a markdown file with 5 sections:

## Identity      ← Who is this skill?
## Orientation   ← When to use it?
## Protocol      ← Step-by-step instructions
## Quality Gates ← What must be true when done?
## Exit Protocol ← What to output?

See docs/SKILLS.md for the full guide.
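
If you write skills by hand, a quick structural check can catch a forgotten section. The sketch below only verifies that the five headings listed above are present. The `missingSkillSections` helper is hypothetical and not part of the harness.

```javascript
// Hypothetical helper: report which of the five required skill sections
// are missing from a skill file's markdown. Not part of Citadel.
const REQUIRED_SECTIONS = [
  "Identity",
  "Orientation",
  "Protocol",
  "Quality Gates",
  "Exit Protocol",
];

function missingSkillSections(markdown) {
  // Collect the text of every level-2 heading ("## Name").
  const found = new Set(
    markdown
      .split(/\r?\n/)
      .map((line) => line.match(/^##\s+(.+?)\s*$/))
      .filter(Boolean)
      .map((m) => m[1])
  );
  return REQUIRED_SECTIONS.filter((name) => !found.has(name));
}
```

An empty result means all five headings exist. It says nothing about whether their contents are any good.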


Project Structure

.claude/
  settings.json           Hook lifecycle configuration
  harness.json            Project config (generated by /do setup)
  hooks/                  8 lifecycle hooks
  skills/                 Skill protocols (6 built-in + your own)
  agents/                 Agent definitions (archon, fleet, etc.)
  agent-context/          Context injected into sub-agents

.planning/
  intake/                 Work items pending processing
  campaigns/              Active + completed campaign files
  fleet/                  Fleet session state + discovery briefs
  coordination/           Multi-instance scope claims
  telemetry/              Agent run + hook timing logs

scripts/
  coordination.js         Multi-instance coordination CLI
  compress-discovery.cjs  Discovery brief compression
  telemetry-log.cjs       Agent and campaign event logging
  telemetry-report.cjs    Performance summaries

Telemetry & Cost Tracking

The harness logs agent events, hook timing, and discovery compression to .planning/telemetry/ (JSONL format, never leaves your machine).

npm run telemetry:report           # Agent run summary
npm run telemetry:report -- --hooks       # Hook timing averages
npm run telemetry:report -- --compression # Discovery compression ratios

Archon and Fleet log campaign start/complete, wave events, and per-agent results automatically. Hooks log their own timing on every invocation.
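
Since the logs are JSONL (one JSON object per line), you can also summarize them yourself. The sketch below averages hook timings from JSONL text. The field names `event`, `hook`, and `durationMs` are assumptions made for illustration and may not match Citadel's log schema.

```javascript
// Hypothetical sketch: average hook durations from JSONL telemetry text.
// Field names (event, hook, durationMs) are assumed, not Citadel's schema.
function averageHookTimings(jsonl) {
  const totals = new Map();
  for (const line of jsonl.split(/\r?\n/)) {
    if (!line.trim()) continue; // skip blank lines
    let record;
    try {
      record = JSON.parse(line);
    } catch {
      continue; // skip a partially written or corrupt line
    }
    if (!record || record.event !== "hook" || typeof record.durationMs !== "number") continue;
    const entry = totals.get(record.hook) ?? { count: 0, sum: 0 };
    entry.count += 1;
    entry.sum += record.durationMs;
    totals.set(record.hook, entry);
  }
  // Convert { count, sum } pairs into plain averages keyed by hook name.
  return Object.fromEntries([...totals].map(([hook, { count, sum }]) => [hook, sum / count]));
}
```

Parsing line by line means one malformed line costs a single record instead of the whole report, which suits append-only logs.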

Token counts are logged when available. Claude Code doesn't currently surface per-session token usage to hooks, so cost tracking depends on your plan's usage dashboard.


When to Use Citadel

Citadel scales down to a typo fix and up to a multi-day parallel campaign. You don't need to use every tier. Most tasks route to a Skill or Marshal automatically. Archon and Fleet are there when your project grows into them.

If you're just starting with Claude Code and don't have a project yet, start with the basics first and come back when you're ready for structure. If you already have a codebase and want your agent to work smarter — even on simple tasks — install Citadel and let /do handle the routing.


Relationship to Superpowers

Superpowers teaches your agent good methodology — brainstorm before coding, write tests first, review before shipping. Citadel gives it the infrastructure to execute that methodology at scale: campaign persistence across sessions, fleet coordination across parallel agents, lifecycle hooks that enforce quality automatically, and telemetry that tracks what happened. They are complementary. Use Superpowers for the workflow discipline. Use Citadel when your work outgrows a single session.


FAQ

How is this different from just using CLAUDE.md?

CLAUDE.md tells Claude about your project. The harness tells Claude how to work — routing decisions through the right tool, persisting state across sessions, enforcing quality through hooks, and coordinating parallel agents. CLAUDE.md is one piece. The harness is the operating system around it.

How much does this cost in tokens?

Skills cost zero tokens when not loaded; they are loaded on demand. The /do router costs ~500 tokens only when a request falls through to the LLM classifier, the last of its four tiers. Most requests resolve in the first three tiers for free. Hooks add minimal overhead (~100 tokens per edit for typecheck feedback). The main cost is the work itself, which you'd pay regardless.

Can I use this with other AI coding tools?

The harness is designed for Claude Code specifically. The skills, hooks, and agent definitions use Claude Code's extension points. The concepts (campaign files, quality gates, discovery relay) are portable, but the implementation assumes Claude Code.

What's the difference between a skill and an agent?

Skills load instructions into the current Claude session (no new process). Agents spawn a new Claude process with its own context window. Skills are cheap and fast. Agents are expensive but isolated.


License

MIT

Author

Built while managing a 668K-line codebase solo. The harness is the distillation of what actually works when you run agents at scale.

About

Agent orchestration harness for Claude Code. Four-tier routing (/do), campaign persistence across sessions, parallel agents in isolated worktrees, discovery relay between waves, lifecycle hooks, circuit breaker, and 6 production-quality skills. From solo developer to institutional scale.
