A multi-agent AI coding harness for multi-repo Spec-Driven Development (SDD). One Source of Truth. N independent repos. M coordinated agents.
Methodology ← Harness Engineering
HarnessXP ← Framework
Specification Gravity ← Core Theory
AI-Native Harness Engineering for Polyrepo & Polyglot Development ← Vision
Modern engineering organizations rarely live in a single repository. Microservices, mobile apps, admin dashboards, and shared libraries each have their own release cadence, their own CI, and their own team. The boundary between them is not a file path — it's an organizational fact.
When you put AI coding agents into this picture, three failure modes appear immediately:
- Cross-repo contract drift. Agent A in the backend repo changes an API field. Agent B in the mobile repo doesn't know. End-to-end tests break in production.
- State that no one owns. Specs live in one repo, code in another, reviews in a third, status in someone's head. No one can answer "what's the state of feature X?" with confidence.
- Phantom merges. A merge command exits 0, but the branch tip is not actually in `main`. The next deploy ships stale code.
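One way to catch a phantom merge is to ask git, after the merge command returns, whether the branch tip is really reachable from `main`. The sketch below uses `git merge-base --is-ancestor` for that check. The function name `verify_merged` and its arguments are illustrative assumptions, not HarnessXP's actual guard.

```bash
# Hypothetical helper: confirm that a branch tip is contained in the target branch.
# Usage: verify_merged <repo_dir> <branch> [target]   (target defaults to main)
verify_merged() {
  local repo_dir branch target
  repo_dir=$1
  branch=$2
  target=${3:-main}
  # --is-ancestor exits 0 only when <branch> is an ancestor of <target>,
  # i.e. every commit on the branch is already reachable from the target.
  # Any other exit status (not an ancestor, or an unknown ref) is treated as a failure.
  if git -C "$repo_dir" merge-base --is-ancestor "$branch" "$target"; then
    echo "OK: $branch is contained in $target"
  else
    echo "PHANTOM MERGE: $branch is not contained in $target" >&2
    return 1
  fi
}
```

Run right after a merge step, a call such as `verify_merged ../backend feature/USER_AUTH` (a hypothetical path and branch name) turns an exit-0-but-not-merged situation into a hard failure instead of a stale deploy.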
HarnessXP is a harness — a set of conventions, scripts, and agent prompts — that makes multi-repo AI coding controllable. It does not invent a new language. It is the operational discipline that wraps your existing repos.
| | Single repo (Claude Code native) | Single repo (OpenSpec) | Multi-repo (HarnessXP) |
|---|---|---|---|
| Repos | 1 | 1 | N (you decide) |
| Spec lives in | `CLAUDE.md` | `.openspec/` | One orchestrator repo |
| Agents work in | the repo | the repo | Each in its own repo |
| Cross-repo contracts | n/a | n/a | Single source + sync |
| State machine | ad-hoc | 4-state | 4-state, enforced |
| Release coordination | none | none | Ordered, with gates |
HarnessXP is what you reach for when your "one repo" is actually three.
```bash
# 1. Clone the framework
git clone https://github.com/mebusw/HarnessXP.git
cd HarnessXP

# 2. Initialize a new project (in a clean parent dir)
mkdir my-project && cd my-project
../HarnessXP/bin/init \
  --name my-project \
  --repo backend:../backend:service:node \
  --repo web:../web:web:react \
  --repo mobile:../mobile:app:flutter

# 3. Add your shared contracts to the orchestrator's shared/ directory
cd orchestrator
# (drop your openapi.yaml / types.ts / config.schema.json here)

# 4. Sync contracts to every business repo
bash .openspec/swarm/sync-shared.sh all

# 5. Start a spec-driven workflow
claude
> /spec USER_AUTH
> /plan USER_AUTH
> /swarm USER_AUTH
> /merge USER_AUTH
```

That's it. You now have N repos coordinated by one orchestrator, with a 4-state machine, named-status files, and a release order.
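After a sync it can be useful to confirm that every business repo really holds the same contract files as the orchestrator. The following is a minimal sketch of such a drift check, not a reproduction of `sync-shared.sh`. The function name `check_contract_drift` is hypothetical, and where the synced copies live inside a business repo is an assumption you would adapt to your layout.

```bash
# Hypothetical drift check: compare each contract file in the orchestrator's
# shared/ directory with the copy that was synced into a business repo.
# Usage: check_contract_drift <orchestrator_shared_dir> <repo_shared_dir>
check_contract_drift() {
  local src_dir dst_dir file name drift
  src_dir=$1
  dst_dir=$2
  drift=0
  for file in "$src_dir"/*; do
    [ -f "$file" ] || continue        # skip subdirectories and the empty-glob case
    name=$(basename "$file")
    if [ ! -f "$dst_dir/$name" ]; then
      echo "MISSING: $name"
      drift=1
    elif ! cmp -s "$file" "$dst_dir/$name"; then
      # cmp -s compares byte for byte and prints nothing itself
      echo "DRIFT: $name"
      drift=1
    fi
  done
  return "$drift"
}
```

The function returns non-zero when any file is missing or differs, so it can serve as a pre-merge check in CI.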
✅ Use HarnessXP when:
- You have 2+ repos that need to ship together (or in a known order)
- Different teams own different repos but share an API surface
- You've been bitten by AI agents writing code that drifts from the spec
- You need to know "which repos have unreleased work for feature X?" at any moment
❌ Don't use HarnessXP when:
- You have a single repo: just use `CLAUDE.md` and Claude Code native
- Your cross-repo work is rare (less than ~10% of changes), so the overhead isn't worth it
- You have no shared contracts between repos
The shortest answer: HarnessXP is the framework that emerged when OpenSpec couldn't do cross-repo, and Superpowers turned out to be too heavy for everyday coding work.
We started with OpenSpec to migrate and unify specs across the company. It worked great inside a single repo. Then we hit the wall: 3 business code repos that shared API contracts, plus one scheduler / orchestrator agent repo (`swarm-orchestrator`). OpenSpec has no concept of cross-repo work, so specs drift, contracts race, and no one can answer "what's the state of feature X?" with confidence.
With some help from GPT, we built our own cross-repo SDD on top of OpenSpec's spec discipline. Then we kept running into bugs in the AI-generated code — half the time was spent debugging what the executor produced, not on the spec itself. The commands grew, the scripts grew, and what came out the other side was effectively a new openspec-commands layer, designed for cross-repo work from day one.
What we open-source here is just the orchestrator and the cross-repo structure — as an SDD / Harness Engineering framework. The 3 business code repos stay private.
OpenSpec is excellent for a single repo. What it doesn't give you is multi-repo coordination:
- No cross-repo source of truth. The spec lives inside the repo it describes. If 3 repos share an API, you have 3 specs that race.
- No release order. Changes are applied one repo at a time, with no concept of "this repo's change must land before that repo's change."
- No phantom-merge guard. `openspec apply` doesn't re-verify that the merge actually committed.
- No agent specialization. One generic agent writes everything.
- No macOS bash 3.2 compat. Scripts use bash 4+ features and break on default macOS shells.
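To make the release-order and bash 3.2 points concrete, here is a minimal sketch of an ordered release loop that stops at the first failing gate. It uses only positional parameters and a plain function call, so it avoids bash 4+ features such as associative arrays and `mapfile`. The names `release_in_order` and the gate command are hypothetical; HarnessXP's own gate mechanism is not shown here.

```bash
# Hypothetical ordered release: walk the repos in the given order and stop at
# the first repo whose gate command fails. Works on bash 3.2.
# Usage: release_in_order <gate_command> <repo> [<repo> ...]
release_in_order() {
  local gate repo
  gate=$1
  shift
  for repo in "$@"; do
    # The gate is any command or function that exits 0 when the repo may ship.
    if "$gate" "$repo"; then
      echo "released: $repo"
    else
      echo "blocked: $repo (later repos were not released)" >&2
      return 1
    fi
  done
}
```

With `release_in_order my_gate backend web mobile`, a failing gate on `web` leaves `mobile` untouched, which is the "this repo's change must land before that repo's change" behavior described above.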
For the full comparison, with code-level examples of the cross-repo source
of truth pattern, see docs/why-not-openspec.md.
We tried Superpowers alongside OpenSpec. Two reasons we walked away:
- Too heavy for the work loop. It significantly extends the time to land a task. The brainstorming ceremony is good for design phases, but at the end of brainstorming you should be holding a spec, not another meeting.
- Specs have too many constraints. The constraint density in a Superpowers spec felt excessive for everyday coding. Combined with `opsx-apply`/`superpowers:execute-plan`, the generated code came out full of bugs that were expensive to understand and debug.
OpenSpec's design was done, but the executor was the weak link. HarnessXP keeps OpenSpec's spec discipline and replaces the executor with smaller, specialized agents.
This repo contains only:
- The orchestrator framework (`swarm-orchestrator`) and its commands / scripts
- The cross-repo structure (orchestrator + N business repos)
- The 6 agent prompts and 7 slash commands
- Documentation and the 3-repo example
It does NOT contain:
- The 3 business code repos we used to develop against
- The internal contracts of those repos
- Any company-specific configuration
A belief, not a marketing line: large organizations always need customization. The framework is the skeleton; your team fills in the muscle.
| | Claude Code native | OpenSpec / spec-kit | Superpowers | HarnessXP |
|---|---|---|---|---|
| Multi-repo | ❌ | ❌ | ❌ | ✅ |
| SDD conventions | loose | strict (single repo) | light | strict (multi-repo) |
| State machine | ad-hoc | 4-state | 3-state | 4-state, slice-allocated |
| Phantom-merge guard | n/a | n/a | n/a | ✅ (built-in) |
| Cross-repo contract sync | manual | n/a | n/a | ✅ (sync-shared.sh) |
| Release gate | n/a | n/a | n/a | ✅ (RELEASE_GATE=1) |
| Session interrupt recovery | n/a | n/a | n/a | ✅ (SESSION_INTERRUPT) |
| 100% local / no SaaS | ✅ | ✅ | ✅ | ✅ |
| Works without Claude Code | ❌ | ❌ | ✅ (Codex) | ❌ (Claude Code only) |
```
HarnessXP/
├── bin/            CLI tools (init, install, sanitize, lint)
├── src/            The framework (what gets installed into your orchestrator)
│   ├── .claude/    7 slash commands + 6 agent prompts
│   ├── .openspec/  6 shell scripts + 5 templates + 3 registry examples
│   ├── templates/  CLAUDE.md template for business repos
│   └── CLAUDE.md   The cross-repo constitution
├── docs/           9 long-form docs
├── schemas/        JSON schemas for config and registry validation
├── examples/       3-repo demo (backend + web + mobile)
└── .github/        CI + issue templates
```
Every task moves through exactly these states, and each transition is owned by exactly one tool:
```
PROPOSED ──▶ IN_PROGRESS ──▶ REVIEWED ──▶ MERGED
                  ▲              │
                  │              │  rejected → rework
                  └──────────────┘
```
| State transition | Who pushes it | Where |
|---|---|---|
| (created) → PROPOSED | `/plan` | `commands/plan.md` |
| PROPOSED → IN_PROGRESS | `launch-agents.sh` | `swarm/launch-agents.sh` |
| REVIEWED → IN_PROGRESS (rollback) | `/review REQUEST_CHANGES` | `commands/review.md` |
| IN_PROGRESS → REVIEWED | reviewer agent | `agents/reviewer.md` |
| REVIEWED → MERGED | `merge-all.sh` | `swarm/merge-all.sh` |
Why slice it? Because in a multi-agent system, the master controller cannot remember the 4 transitions. Each transition is owned by the tool that has the context to do it correctly. This is the lesson from running HarnessXP on 100+ tasks across 3 repos.
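The ownership table can be read as a lookup from a transition to the single tool allowed to perform it. As an illustrative sketch only (the function name `transition_owner` is an assumption, and the framework's scripts may encode this differently), a bash 3.2-compatible `case` statement is enough to express it:

```bash
# Hypothetical lookup: print the one tool that owns a state transition,
# or fail if the transition is not in the ownership table.
# Usage: transition_owner <from_state> <to_state>   (use "" as <from_state> for creation)
transition_owner() {
  case "$1->$2" in
    "->PROPOSED")             echo "/plan" ;;
    "PROPOSED->IN_PROGRESS")  echo "launch-agents.sh" ;;
    "IN_PROGRESS->REVIEWED")  echo "reviewer agent" ;;
    "REVIEWED->IN_PROGRESS")  echo "/review REQUEST_CHANGES" ;;   # rollback for rework
    "REVIEWED->MERGED")       echo "merge-all.sh" ;;
    *)
      echo "illegal transition: $1 -> $2" >&2
      return 1
      ;;
  esac
}
```

A script that is about to write a new status can call this first and refuse to proceed when it is not the listed owner, or when the transition (for example `PROPOSED` straight to `MERGED`) does not exist at all.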
If you already have N business repos with working code, read docs/legacy-migration.md first. The 6 iron rules:
- Freeze the baseline. Use `/goal` to pin "what exists" vs "what's new".
- Reverse-engineer the contract. `openapi.yaml` is derived from real routes, not from requirements docs.
- Sole-owner shared contracts. `shared/` has one writer (architect). Everyone else syncs.
- Route ≠ implemented. A route in `index.js` doesn't mean a model method exists. Audit the model layer.
- Tag the boundaries. Use OpenAPI `tags` and description to mark "do not touch" (much harder to ignore than CLAUDE.md).
- Always do coverage audit. Specs aren't done until you've checked every requirement maps to a spec.
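The coverage audit in the last rule can be partly automated. The sketch below assumes, purely for illustration, that requirements carry IDs of the form `REQ-<digits>` and that specs are text files under one directory. Neither convention comes from HarnessXP, and `audit_coverage` is a hypothetical name.

```bash
# Hypothetical coverage audit: every requirement ID found in the requirements
# file must be mentioned in at least one file under the specs directory.
# Usage: audit_coverage <requirements_file> <specs_dir>
audit_coverage() {
  local req_file specs_dir id missing
  req_file=$1
  specs_dir=$2
  missing=0
  # Collect the distinct IDs; they contain no whitespace, so word splitting is safe here.
  for id in $(grep -oE 'REQ-[0-9]+' "$req_file" | sort -u); do
    # -r: search recursively, -q: no output, -w: whole-word match so REQ-1 does not match REQ-10
    if ! grep -rqw -- "$id" "$specs_dir"; then
      echo "UNCOVERED: $id"
      missing=1
    fi
  done
  return "$missing"
}
```

A non-zero exit means at least one requirement has no spec that mentions it. A mention is only a lower bar, so a human still needs to confirm that the spec actually covers the requirement.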
HarnessXP is open source. It is also a real product category: every multi-repo team eventually needs an expert who has been through the traps (phantom merge, contract drift, session interrupt, macOS bash 3.2 incompatibility, naming fallback rot). The framework gives you the discipline; we give you the ramp-up.
Jacky Shen (GitHub · LinkedIn) offers:
- 2-day onsite workshop — your team leaves with HarnessXP running on your real repos
- Architecture review — half-day, your current multi-repo SDD setup
- Hourly troubleshooting — when the swarm goes sideways
📧 Reach out: open an issue with the consulting label, or email
support@HarnessXP.org (placeholder).
The framework is free. The experience is not. That's by design.
PRs welcome. See CONTRIBUTING.md. Read
CODE_OF_CONDUCT.md first.
For large changes, please open a Discussion first — the state machine and the cross-repo write boundary are load-bearing; we'd rather agree on direction than reject a finished PR.
MIT. See LICENSE.