braid

Status: v0.1 — first build stage, actively evolving. The end-to-end pipeline (plan → spawn → worker → report → critic → merge-queue with post-merge verify) has been validated with a single worker on a real repository. Running more workers in parallel is supported by the design but not yet stress-tested — see ROADMAP.md.

Expect frequent, visible changes. braid is being built in the open, iteratively, against real use. Design decisions get revised as empirical feedback comes in. Every change is tracked in git — git log is the honest record of what moved when and why. If you pin a specific behavior in your own workflow, pin to a commit/tag rather than to main.

A combination of existing SOTA patterns from open-source coding agent tools, packaged as a small auditable bash harness.

A capable planner model (in v0.1: Claude Opus via Claude Code) acts as orchestrator — planning, decomposing, reviewing. A separate CLI worker (in v0.1: OpenAI Codex) executes under strict contracts. Each worker runs in an isolated git worktree, produces a structured report, and the orchestrator independently verifies the output before a serial merge-queue integrates it with a post-merge-verify safety net.

Think of it as a pair-programming pattern where one agent plans the work, a second agent executes under a binding contract, and a merge-queue enforces integration safety. Multiple strands, one fabric.


Quickstart (for developers in a hurry)

# 1. Clone and link the CLI
git clone https://github.com/GeOhDoubleT/braid.git
cd braid
export PATH="$PWD/bin:$PATH"

# 2. Set up a target repo (clones if URL, validates if path)
braid init https://github.com/you/your-project.git

# 3. Activate the environment + check your tooling
source .env.current
braid doctor

# 4. Try the built-in demo (self-contained Python sandbox)
braid init-demo
braid validate fix-increment

# 5. From a claude-code session in this directory, run /braid-task

What problem does this solve?

Parallel AI coding agents share two persistent failure modes:

  1. Silent scope creep — agents touch files they shouldn't and quietly refactor things they happened to notice
  2. Uncritical self-reporting — agents claim COMPLETED when tests actually failed, and the breakage only surfaces at integration time

braid addresses both by separating planning from execution and enforcing independent verification:

  • A Sprint Contract specifies exactly what files a worker may touch, what tests must pass, what it must NOT do
  • The worker runs in a git worktree, sees only its own contract
  • The worker's report is not trusted — the orchestrator re-runs tests itself
  • Merges go through a serial queue with post-merge verification and automatic rollback on conflict
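
The contract is a plain YAML file dropped into the worker's worktree. As an illustration only (the field names below are plausible guesses, not braid's actual schema), a contract for the demo's fix-increment task could look like this:

```shell
# Illustrative sketch: writes a hypothetical Sprint Contract.
# All field names are assumptions, not braid's actual schema.
cat > .sprint-contract.yaml <<'EOF'
task_id: fix-increment
goal: "Fix the off-by-one bug in increment()"
allowed_paths:
  - src/counter.py
  - tests/test_counter.py
forbidden_paths:
  - docs/**
test_commands:
  - pytest tests/test_counter.py
budget_minutes: 30
EOF
```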

braid does not invent new ideas. It combines patterns you can find elsewhere (see Limitations and prior art) into one opinionated harness, and keeps the implementation small (~500 lines of bash) so you can read, trust, and modify it.


You describe intent — the planner does the rest

You don't need to write structured prompts. Inside a Claude Code session with braid, two slash commands cover the full workflow:

/braid-setup — interactive setup for a new target repo. Claude asks which repo, runs braid init, verifies the environment, installs the worker skill. You just answer the questions.

/braid-task — end-to-end dispatch of a feature or change. You say what you want in one sentence. The command instructs Claude to:

  1. Clarify (max 3 short questions) only if the intent is ambiguous
  2. Explore the target repo first — read the README, scan directory structure, find similar existing features, detect test conventions
  3. Propose 2–4 atomic tasks as a first slice, with an explicit note of what's deferred to later slices
  4. Write contracts once you pick which tasks to dispatch first
  5. Brief you per contract — goal, allowed paths, test commands, budget — and wait for your dispatch approval
  6. Spawn, verify, and merge — the worker runs, Claude re-runs the tests itself (independent critic), you approve the merge

Example:

/braid-task

Feature: add a "Leads" entity to our CRM, similar to how Opportunities work.

That's enough. Claude will explore the repo, check whether Leads already exists, propose a decomposition (e.g. "Task 1: entity + migration, Task 2: GraphQL resolver, Task 3: seed data"), and ask which to dispatch first. You never write a YAML contract by hand unless you want to.


What braid does well

  • Strict scope isolation. The allowed_paths allow-list and forbidden_paths deny-list are enforced at planning time and re-checked at verify time.
  • Independent post-hoc verification. The orchestrator runs tests itself against the worker's actual diff — no "trust me bro" from the worker.
  • Serial merges with rollback. The merge-queue rebases, squash-merges, and re-runs tests against the merged state. If post-merge verify fails (intent-conflict with a parallel task), it rolls back and generates a rework contract.
  • Worker-infra cleanliness. The contract and discipline files (AGENTS.md, .sprint-contract.yaml) never leak into the target repo.
  • One-file-CLI. bin/braid is a single bash script plus bin/braid-merge-queue. Read it in an afternoon.
  • Two-agent asymmetry. Use the expensive planner model only for planning/review. Workers can be cheap models on fast profiles.

Status — v0.1 (first build stage)

braid is in its first build stage. The full end-to-end pipeline — plan, spawn, worker execution, independent critic, merge queue with post-merge verify and rollback — has been validated with a single worker on a real git repository. Running multiple workers in parallel is supported by the design (bash spawns are cheap and worktrees are isolated), but has not been stress-tested for orthogonal-scope availability, API rate-limit behavior, merge-queue throughput, or observability with many workers. Scaling work is an explicit next step in the roadmap — see ROADMAP.md.

Issues and PRs welcome for anything that breaks or feels wrong.

What braid is NOT (today)

  • Not a multi-vendor adapter. Workers are hard-wired to the OpenAI Codex CLI. Extending to Gemini/Claude workers is feasible but not shipped.
  • Not a hosted service. Everything runs on your machine (or WSL). No server, no daemon.
  • Not a replacement for Claude Code Agent Teams. Agent Teams coordinate Claude-to-Claude with built-in mailbox. braid is Claude-plans / Codex-executes, which Agent Teams does not support.
  • No cross-worker context sharing. Workers are deliberately isolated. If task B depends on task A's output, serialize them — don't parallelize. Cross-worker gossip is an anti-pattern.
  • Single-lane merge queue. No dependency graph between tasks (yet). Parallel tasks must have orthogonal scopes.
  • No observability layer. Status is read from filesystem (braid status) and tmux panes. Usable for ~3 workers, unclear above that — the ROADMAP treats this as an open question, not a commitment.

Architecture at a glance

                     ┌──────────────────────────────┐
                     │  Orchestrator (planner LLM)  │
                     │  — reads CLAUDE.md           │
                     │  — writes Sprint Contracts   │
                     │  — runs Critic step          │
                     └──────────────┬───────────────┘
                                    │
                       braid spawn <task-id>
                                    │
                     ┌──────────────▼──────────────┐
                     │ git worktree (isolated)     │
                     │  + .sprint-contract.yaml    │
                     │  + AGENTS.md                │
                     │  + .git/info/exclude        │
                     └──────────────┬──────────────┘
                                    │
                              codex exec
                                    │
                     ┌──────────────▼──────────────┐
                     │ Worker (Codex CLI)          │
                     │  — reads contract           │
                     │  — runs tests, edits files  │
                     │  — writes atomic report     │
                     └──────────────┬──────────────┘
                                    │
                    reports/<task-id>.md + .exit
                                    │
                     ┌──────────────▼──────────────┐
                     │ Orchestrator: Critic        │
                     │  — re-runs test_commands    │
                     │  — checks scope compliance  │
                     │  — PASS or Rework contract  │
                     └──────────────┬──────────────┘
                                    │
                         braid merge <task-id>
                                    │
                     ┌──────────────▼──────────────┐
                     │ Merge Queue (serial lock)   │
                     │  1. rebase on target        │
                     │  2. squash merge            │
                     │  3. re-run tests on merged  │
                     │     state (post-merge)      │
                     │  4. rollback on failure     │
                     │  5. archive + cleanup       │
                     └─────────────────────────────┘

More detail: docs/ARCHITECTURE.md.
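
The merge-queue's steps can be sketched end to end in plain git, here against a throwaway repository. Everything below is a hedged illustration, not braid's actual code; the grep stands in for re-running the contract's test_commands:

```shell
# Illustrative sketch of the merge-queue steps: rebase -> squash merge ->
# post-merge test -> rollback on failure. Runs in a throwaway repo.
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"
git init -q -b main
git config user.email demo@example.com
git config user.name demo
echo base > app.txt
git add app.txt && git commit -qm "base"

# Simulate a worker branch produced in an isolated worktree
git checkout -qb task-1
echo feature >> app.txt
git add app.txt && git commit -qm "task-1 work"

# Merge-queue steps
git rebase -q main                       # 1. rebase on target
git checkout -q main
git merge -q --squash task-1             # 2. squash merge (staged, not committed)
git commit -qm "task-1: squashed"
if grep -q feature app.txt; then         # 3. stand-in for re-running tests
  echo "post-merge verify: PASS"
else
  git reset -q --hard HEAD~1             # 4. rollback on failure
  echo "post-merge verify: FAIL (rolled back)"
fi
```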


Requirements

| Component | Purpose | Install |
|---|---|---|
| bash 4+ | CLI is pure bash | preinstalled on Linux/macOS/WSL |
| git 2.30+ | worktree operations | `apt install git` / `brew install git` |
| tmux (recommended) | multi-pane parallel workers | `apt install tmux` / `brew install tmux` |
| OpenAI Codex CLI | worker executor | `npm install -g @openai/codex` |
| A planner LLM | orchestrator (e.g. Claude Code) | per vendor install |

braid is developed and tested on Linux (Ubuntu), macOS, and Windows-via-WSL2. Native Windows without WSL is not supported — tmux and POSIX worktree semantics are required.
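
A rough manual equivalent of braid doctor (the real command's checks may differ):

```shell
# Rough manual environment check; `braid doctor` automates this, and its
# actual checks may differ from this sketch.
need() { command -v "$1" >/dev/null 2>&1 || echo "missing: $1"; }

need bash
need git
need tmux    # recommended for parallel workers, not strictly required
need codex   # OpenAI Codex CLI worker

# braid requires bash 4+
[ "${BASH_VERSINFO[0]:-0}" -ge 4 ] || echo "bash 4+ required (found ${BASH_VERSION:-unknown})"
```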


Docs

  • New to this? Start with docs/GETTING_STARTED.md — from fresh WSL install to a first passing smoke test.
  • Know your tools? docs/ARCHITECTURE.md for the design, docs/CONTRIBUTING.md for how to extend.
  • Why these choices? docs/FOUNDATIONS.md maps every architectural pattern in braid to its academic / industry source.
  • What do real runs look like? docs/FIELD_NOTES.md is a running log of observations from actual sessions — token distributions, self-decomposition events, apparent-hang phenomena, etc. n=1 per entry, not statistics, but honest.

What braid is (and is not)

braid is a combination of existing patterns from open-source coding agent tools and engineering blogs. Nothing here is novel. The contribution, if any, is the packaging: these patterns usually show up separately, and a single small, auditable bash harness that combines them was missing. braid is an empirical probe, not a proven solution.

Patterns braid uses (and where they come from)

Where braid sits vs. adjacent OSS projects

Each listed project overlaps with braid in at least one column; none combines all of them.

| Project | Planner/Executor | Contract (YAML) | File-level deny-list | Independent Critic | Post-merge verify + rollback | Worktree isolation |
|---|---|---|---|---|---|---|
| braid | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| agent-mux | ✓ | ✓ (JSON) | partial | | | |
| ComposioHQ/agent-orchestrator | ~ | | | | | |
| nwiizo/ccswarm | — (Claude-only) | | | | | |
| EveryInc/compound-engineering-plugin | | | | | | |
| xmm/codex-bmad-skills | | | | | | |
| oh-my-claudecode | | | | | | — (tmux only) |
| Claude Code Agent Teams | ✓ (Claude↔Claude) | | | | | |

Pick a different tool if you need any of:

  • Cross-vendor workers (Gemini, Claude-as-worker, local LLMs) — see CrewAI, LangGraph
  • UI-driven review and approval — see Cognition's Devin, Sourcegraph Amp
  • Dependency graphs between tasks — see LangGraph or Prefect
  • Claude-to-Claude coordination — use Claude Code Agent Teams
  • Mass IaC generation — dedicated frameworks exist for that niche

What we think braid could show empirically

These are measurement opportunities anyone can run with the harness as-is:

  1. Scope-adherence rate — on a fixed task set, measure how often deny-list files are touched with vs. without the deny-list mechanism
  2. Cost-per-accepted-merge — cross-vendor (Opus-plan + Codex-execute) vs. homogeneous (Claude-only, Codex-only) on identical tasks
  3. Post-merge-verify catch rate — how often does post-merge verify reject a merge that looked clean at pre-merge verify?
  4. Parallelism sweet spot — empirical curve of accepted merges per hour vs. parallelism, on real repos
  5. Reasoning-effort elasticity — for a task class, what's the quality/cost frontier across minimal|low|medium|high|xhigh?

None of these are promised. None have been run yet. The harness is small enough that anyone can produce the numbers.


License

MIT — see LICENSE. Open source first. Pull requests welcome, feature-request issues encouraged.

Author

Created by @GeOhDoubleT, 2026. Inspired by public patterns in the multi-agent coding tools space; see Limitations and prior art for attribution.
