CraftSwarm

One prompt. One app.
_{Describe what you want to build. A swarm of AI agents plans, codes, tests, and ships it.}

What It Builds

A few experimental projects built with CraftSwarm:

_{3D City Builder — Vite + Three.js}	_{Farming Sim with NPCs — Three.js}
_{Neon 3D Sudoku — Three.js}	_{Document Editor — React + Supabase}
_{Racing Game — real-time agent activity}	_{Visual testing: agents analyze what they build}

_{Spec tree — structured requirements with milestone progress tracking}
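As a rough illustration of how milestone progress could be derived from a spec tree, here is a minimal sketch. The `SpecNode` shape and the function names are assumptions made for this example; they are not taken from the CraftSwarm codebase.

```typescript
// Hypothetical node shape: a spec item is either a leaf requirement or a group.
interface SpecNode {
  title: string;
  done?: boolean; // only meaningful on leaves
  children?: SpecNode[];
}

// Count completed and total leaf requirements under a node.
function countLeaves(node: SpecNode): { done: number; total: number } {
  if (!node.children || node.children.length === 0) {
    return { done: node.done ? 1 : 0, total: 1 };
  }
  return node.children.reduce(
    (acc, child) => {
      const c = countLeaves(child);
      return { done: acc.done + c.done, total: acc.total + c.total };
    },
    { done: 0, total: 0 },
  );
}

// Progress of a node as a fraction between 0 and 1.
function progress(node: SpecNode): number {
  const { done, total } = countLeaves(node);
  return total === 0 ? 0 : done / total;
}

// Illustrative tree: three leaf requirements, two of them done.
const milestone: SpecNode = {
  title: "Milestone 1",
  children: [
    { title: "Render grid", done: true },
    {
      title: "Input handling",
      children: [
        { title: "Keyboard", done: true },
        { title: "Mouse", done: false },
      ],
    },
  ],
};

console.log(progress(milestone));
```

Counting leaves rather than direct children keeps a deeply nested group from weighing the same as a single requirement.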

Key Features

Multi-agent orchestration — 14 specialized roles with isolated workspaces and automated merge resolution
Spec-driven pipeline — structured workflow from specification through bootstrap to development, with checkpoints gating each transition
Workspace isolation — each agent works on an isolated copy; a tiered merge pipeline (diff3, tree-sitter validation, LLM-assisted resolution) handles conflicts
Continuous auditing — spec-vs-code conformance checks, code quality reviews, infrastructure issue detection, and spec gap analysis run alongside development
Testing pipeline — test scripts, test hooks for browser-based visual testing, environment management, and structured test reports
Visual testing — agents launch, screenshot, and analyze the applications they build via Gemini vision
Innovation system — agents can propose ideas and improvements that feed back into the spec tree
Real-time dashboard — monitor agent activity and file writes, chat with the team leader, track merges, review specs/plans/tasks, and inspect agent communications
Desktop app — native macOS, Linux, and Windows builds via Tauri
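The tiered merge idea mentioned above can be sketched as a chain of resolvers, each tried in order, with a validation step deciding whether a result is accepted or escalated. Everything below is an assumption for illustration: the type names, the trivial three-way rule standing in for diff3, the brace check standing in for tree-sitter validation, and the "prefer ours" fallback standing in for LLM-assisted resolution.

```typescript
// Hypothetical types; none of these names come from the CraftSwarm codebase.
type MergeInput = { base: string; ours: string; theirs: string };
type MergeResult = { ok: true; merged: string; tier: string } | { ok: false };

// A tier returns merged text, or null to pass the conflict to the next tier.
type Tier = { name: string; resolve: (input: MergeInput) => string | null };

// Try each tier in order; accept the first result that passes validation.
function runTiers(
  input: MergeInput,
  tiers: Tier[],
  isValid: (text: string) => boolean,
): MergeResult {
  for (const tier of tiers) {
    const merged = tier.resolve(input);
    if (merged !== null && isValid(merged)) {
      return { ok: true, merged, tier: tier.name };
    }
  }
  return { ok: false };
}

// Stand-in for diff3: a whole-file three-way rule (real diff3 works per hunk).
const trivialThreeWay: Tier = {
  name: "three-way",
  resolve: ({ base, ours, theirs }) => {
    if (ours === theirs) return ours; // both sides agree
    if (ours === base) return theirs; // only theirs changed
    if (theirs === base) return ours; // only ours changed
    return null; // true conflict: escalate
  },
};

// Stand-in for an LLM-assisted resolver; here it simply prefers "ours".
const fallbackResolver: Tier = { name: "fallback", resolve: ({ ours }) => ours };

// Stand-in for syntax validation: checks that curly braces are balanced.
function bracesBalanced(text: string): boolean {
  let depth = 0;
  for (let i = 0; i < text.length; i++) {
    if (text[i] === "{") depth++;
    if (text[i] === "}") depth--;
    if (depth < 0) return false;
  }
  return depth === 0;
}

const tiers = [trivialThreeWay, fallbackResolver];
const clean = runTiers({ base: "a", ours: "a", theirs: "b" }, tiers, bracesBalanced);
const conflict = runTiers({ base: "a", ours: "x", theirs: "y" }, tiers, bracesBalanced);
console.log(clean, conflict);
```

Ordering the tiers from cheapest to most expensive means the costly resolver only runs on conflicts the mechanical tiers could not settle.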

Important before starting

This project is a token burner! It comes with a large protocol layer that spends a lot of tokens to keep the agents' work structured. Experimentation was done with a $50/month Minimax 2.5 Max coding plan, which is enough, most of the time, to keep the engine running 24/7.
Don't try this with an API key billed per million tokens: your bank account won't forgive you.

Architecture

┌──────────────────────────────────────────────────┐
│               Dashboard (Next.js)                │
│         Real-time UI / WebSocket / REST          │
└────────────────────────┬─────────────────────────┘
                         │
┌────────────────────────┴─────────────────────────┐
│                Backend (Fastify)                 │
│                                                  │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  │
│  │ Coordinator│  │   Agents   │  │  Services  │  │
│  │ (lifecycle)│  │ (LLM loops │  │ (browser,  │  │
│  │            │  │  + tools)  │  │  files,    │  │
│  └────────────┘  └────────────┘  │ embeddings)│  │
│                                  └────────────┘  │
│  ┌────────────────────────────────────────────┐  │
│  │            SQLite + Drizzle ORM            │  │
│  │  (projects, agents, specs, plans, tasks)   │  │
│  └────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────┘

Package	Description
`backend`	Fastify API server, orchestration engine, agent runtime, Drizzle/SQLite storage
`dashboard`	Next.js management UI with real-time monitoring
`web-portal`	Public-facing landing page
`tauri`	Desktop app wrapper (macOS, Linux, Windows)
`shared`	Shared TypeScript types across packages

Agent Roles

Role	Purpose
Team Leader	Coordinates agents, dispatches work, communicates with the human
Spec Manager	Breaks down requirements into a structured spec tree
Planner	Creates implementation plans from specs
Tech Lead	Reviews code, manages merges from executors to main
Executor	Implements code in an isolated workspace
Tester	Runs and writes tests, validates implementations
Code Tester	Focused on code-level test coverage and test execution
Code Quality	Reviews code for standards, patterns, and issues
Spec Auditor	Validates that implementations match their specs
DevOps	Manages environments, builds, and deployments
Release Manager	Handles release preparation and packaging
Innovation Manager	Proposes improvements and new approaches
Test Hooks Manager	Manages browser test hooks for visual testing
Test Scripts Manager	Manages and maintains test script suites

Getting Started

Prerequisites

Node.js >= 20
pnpm >= 10
An LLM API key (OpenAI, Anthropic, OpenRouter, or compatible)

Install

git clone https://github.com/BenjaminPiette/craftswarm.git
cd craftswarm
pnpm install

Configure

Before starting the engine, launch the app (see Run), open the dashboard settings, and configure your providers:

Provider	Required	Purpose
LLM (OpenAI-compatible)	Yes	Agent reasoning — base URL, API key, and model (works with OpenAI, Anthropic, OpenRouter, or any compatible API)
Embedding	Yes	Semantic search for code, specs, and expertise skills
Gemini	Optional	Vision — agents use it to analyze screenshots of what they build
Gemini Search	Optional	Grounded web search for agents that need external information

The dashboard validates each provider on save and shows connection status.
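To make the required/optional split in the table concrete, here is a minimal sketch of a settings object and a check for missing required providers. The `ProviderSettings` shape, its field names, and the example URL and model are hypothetical; the actual dashboard settings may be structured differently, and the embedding provider may need more than an API key.

```typescript
// Hypothetical settings shape; field names are illustrative only.
interface ProviderSettings {
  llm?: { baseUrl: string; apiKey: string; model: string }; // required
  embedding?: { apiKey: string }; // required
  gemini?: { apiKey: string }; // optional: vision
  geminiSearch?: { apiKey: string }; // optional: grounded web search
}

// Return the names of required providers that are missing or incomplete.
function missingRequired(s: ProviderSettings): string[] {
  const missing: string[] = [];
  if (!s.llm || !s.llm.baseUrl || !s.llm.apiKey || !s.llm.model) {
    missing.push("llm");
  }
  if (!s.embedding || !s.embedding.apiKey) {
    missing.push("embedding");
  }
  return missing;
}

// Placeholder values, not real endpoints or keys.
const complete: ProviderSettings = {
  llm: { baseUrl: "https://api.example.com/v1", apiKey: "sk-placeholder", model: "example-model" },
  embedding: { apiKey: "emb-placeholder" },
};

console.log(missingRequired({}), missingRequired(complete));
```

A check like this only catches empty fields; it does not replace the connection test the dashboard runs on save.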

Run

# Browser mode (backend + dashboard + web portal)
pnpm web:dev

# Desktop app (Tauri)
pnpm tauri:dev

Open the dashboard, create a project, write a spec, and start the engine. The agents take it from there.

Tech Stack

Layer	Technology
Frontend	Next.js 16, React 19, Tailwind CSS 4, Zustand, React Query
Backend	Fastify 5, TypeScript 5.9
Database	SQLite (better-sqlite3), Drizzle ORM
Vector search	sqlite-vec (embeddings)
LLM providers	OpenAI, Anthropic Claude, Google Gemini, OpenRouter
Browser automation	Playwright
Desktop	Tauri
Monorepo	pnpm workspaces, Turborepo
Testing	Vitest

Development

pnpm turbo typecheck   # type-check all packages
pnpm build             # build all packages
pnpm lint              # lint all packages
pnpm test              # run tests

Database migrations are managed with Drizzle Kit:

cd backend
pnpm db:generate       # generate migration from schema changes
pnpm db:studio         # interactive database browser

Migrations run automatically on server startup.

Status

CraftSwarm is an experimental project. It was actively developed for about 3 weeks before work slowed down, because the path to a fully functional, production-ready app is still long.

Known Caveats

Browser testing / feedback loop — visual testing works but the cycle of launch, screenshot, analyze, fix can still get stuck or loop
App deployment — agents build and test locally but there is no deployment management yet
Milestones — the milestone/checkpoint system exists but is incomplete and not fully wired end-to-end
Spec prioritization — agents don't always pick the most impactful spec items to implement first
Manual video recordings — the assisted testing flow with video recording and analysis is partially implemented

License

MIT
