CraftSwarm

One prompt. One app.
Describe what you want to build. A swarm of AI agents plans, codes, tests, and ships it.



What It Builds

A few examples of experimental projects built:

  • 3D City Builder game (Vite + Three.js)
  • Farming simulation with NPCs (Three.js)
  • Neon 3D Sudoku (Three.js)
  • Google Docs-style document editor (React + Supabase)
  • Racing game, shown with real-time agent activity in the dashboard
  • Visual testing: agents analyze what they build
  • Spec tree: structured requirements with milestone progress tracking

Key Features

  • Multi-agent orchestration — 14 specialized roles with isolated workspaces and automated merge resolution
  • Spec-driven pipeline — structured workflow from specification through bootstrap to development, with checkpoints gating each transition
  • Workspace isolation — each agent works on an isolated copy; a tiered merge pipeline (diff3, tree-sitter validation, LLM-assisted resolution) handles conflicts
  • Continuous auditing — spec-vs-code conformance checks, code quality reviews, infrastructure issue detection, and spec gap analysis run alongside development
  • Testing pipeline — test scripts, test hooks for browser-based visual testing, environment management, and structured test reports
  • Visual testing — agents launch, screenshot, and analyze the applications they build via Gemini vision
  • Innovation system — agents can propose ideas and improvements that feed back into the spec tree
  • Real-time dashboard — monitor agent activity and file writes, chat with the team leader, track merges, review specs/plans/tasks, and inspect agent communications
  • Desktop app — native macOS, Linux, and Windows builds via Tauri
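
The tiered merge pipeline mentioned under workspace isolation can be pictured as a chain of resolvers tried in order, each escalating to the next when it cannot produce a result. The sketch below is an illustration only: `MergeInput`, `MergeTier`, `resolveMerge`, and `trivialThreeWay` are hypothetical names, not taken from the CraftSwarm codebase, and the first tier is a whole-file rule that is much cruder than a real diff3.

```typescript
// Hypothetical types: none of these names come from the CraftSwarm codebase.
interface MergeInput {
  base: string;   // common ancestor content
  ours: string;   // content on the main branch
  theirs: string; // content from the agent's isolated workspace
}

interface MergeTier {
  name: string;
  // Returns merged content, or null when this tier cannot resolve the conflict.
  tryMerge(input: MergeInput): string | null;
}

interface MergeOutcome {
  resolvedBy: string | null; // name of the tier that succeeded, or null
  content: string | null;
}

// Stand-in for the first tier: a whole-file three-way rule (not a real diff3).
const trivialThreeWay: MergeTier = {
  name: "three-way",
  tryMerge({ base, ours, theirs }) {
    if (ours === theirs) return ours; // both sides agree
    if (ours === base) return theirs; // only the agent changed the file
    if (theirs === base) return ours; // only main changed the file
    return null;                      // both changed differently: escalate
  },
};

// Walk the tiers in order and stop at the first one that produces a result.
function resolveMerge(tiers: MergeTier[], input: MergeInput): MergeOutcome {
  for (const tier of tiers) {
    const content = tier.tryMerge(input);
    if (content !== null) return { resolvedBy: tier.name, content };
  }
  return { resolvedBy: null, content: null };
}
```

In a fuller version, later tiers (for example a syntax-validation step or an LLM-assisted resolver) would be further `MergeTier` entries appended to the array, so the cheap tier always runs first.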

Important before starting

  • This project is a token burner! It comes with a large protocol layer that spends tons of tokens to keep things structured. Experimentation was done with a $50/month Minimax 2.5 Max coding plan, which is (most of the time) enough to keep the engine running 24/7.
  • Don't try this with a per-million-token API key — your bank account won't forgive you.

Architecture

┌─────────────────────────────────────────────────┐
│                  Dashboard (Next.js)             │
│          Real-time UI / WebSocket / REST         │
└──────────────────────┬──────────────────────────┘
                       │
┌──────────────────────┴──────────────────────────┐
│                Backend (Fastify)                 │
│                                                  │
│  ┌────────────┐  ┌───────────┐  ┌────────────┐  │
│  │ Coordinator│  │  Agents   │  │  Services  │  │
│  │ (lifecycle)│  │ (LLM loops│  │ (browser,  │  │
│  │            │  │  + tools) │  │  files,    │  │
│  └────────────┘  └───────────┘  │  embeddings│  │
│                                  └────────────┘  │
│  ┌────────────────────────────────────────────┐  │
│  │         SQLite + Drizzle ORM               │  │
│  │    (projects, agents, specs, plans, tasks)  │  │
│  └────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────┘

| Package | Description |
|---|---|
| backend | Fastify API server, orchestration engine, agent runtime, Drizzle/SQLite storage |
| dashboard | Next.js management UI with real-time monitoring |
| web-portal | Public-facing landing page |
| tauri | Desktop app wrapper (macOS, Linux, Windows) |
| shared | Shared TypeScript types across packages |

Agent Roles

| Role | Purpose |
|---|---|
| Team Leader | Coordinates agents, dispatches work, communicates with the human |
| Spec Manager | Breaks down requirements into a structured spec tree |
| Planner | Creates implementation plans from specs |
| Tech Lead | Reviews code, manages merges from executors to main |
| Executor | Implements code in an isolated workspace |
| Tester | Runs and writes tests, validates implementations |
| Code Tester | Focused on code-level test coverage and test execution |
| Code Quality | Reviews code for standards, patterns, and issues |
| Spec Auditor | Validates that implementations match their specs |
| DevOps | Manages environments, builds, and deployments |
| Release Manager | Handles release preparation and packaging |
| Innovation Manager | Proposes improvements and new approaches |
| Test Hooks Manager | Manages browser test hooks for visual testing |
| Test Scripts Manager | Manages and maintains test script suites |
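
As an illustration of how the 14 roles could be represented in shared TypeScript types, here is a minimal sketch. The kebab-case identifiers, `AGENT_ROLES`, and `isAgentRole` are assumptions made for this example; the actual names in the `shared` package may differ.

```typescript
// Role identifiers derived from the role list above; the spelling is illustrative.
const AGENT_ROLES = [
  "team-leader",
  "spec-manager",
  "planner",
  "tech-lead",
  "executor",
  "tester",
  "code-tester",
  "code-quality",
  "spec-auditor",
  "devops",
  "release-manager",
  "innovation-manager",
  "test-hooks-manager",
  "test-scripts-manager",
] as const;

// Union type of all role identifiers, derived from the array above.
type AgentRole = (typeof AGENT_ROLES)[number];

// Runtime guard for values arriving from untyped sources (e.g. JSON payloads).
function isAgentRole(value: string): value is AgentRole {
  return (AGENT_ROLES as readonly string[]).indexOf(value) !== -1;
}
```

Deriving the union from a single array keeps the runtime list and the compile-time type in sync when a role is added or renamed.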

Getting Started

Prerequisites

  • Node.js >= 20
  • pnpm >= 10
  • An LLM API key (OpenAI, Anthropic, OpenRouter, or compatible)

Install

git clone https://github.com/BenjaminPiette/craftswarm.git
cd craftswarm
pnpm install

Configure

Before starting the engine, open the dashboard settings and configure your API keys:

| Provider | Required | Purpose |
|---|---|---|
| LLM (OpenAI-compatible) | Yes | Agent reasoning: base URL, API key, and model (works with OpenAI, Anthropic, OpenRouter, or any compatible API) |
| Embedding | Yes | Semantic search for code, specs, and expertise skills |
| Gemini | Optional | Vision: agents use it to analyze screenshots of what they build |
| Gemini Search | Optional | Grounded web search for agents that need external information |

The dashboard validates each provider on save and shows connection status.
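
To make the required/optional split concrete, the following sketch models the four providers as a settings object and reports which required ones are incomplete. The `ProviderSettings` shape and every field name in it are assumptions for illustration, not the dashboard's actual schema, and the check only covers completeness; it does not test the connection the way the dashboard does.

```typescript
// Hypothetical settings shape; field names are assumptions, not the real schema.
interface LlmSettings {
  baseUrl: string;
  apiKey: string;
  model: string;
}

interface ProviderSettings {
  llm?: LlmSettings;                 // required: agent reasoning
  embedding?: { apiKey: string };    // required: semantic search
  gemini?: { apiKey: string };       // optional: vision
  geminiSearch?: { apiKey: string }; // optional: grounded web search
}

// Returns the names of required providers that are missing or incomplete.
function missingRequiredProviders(settings: ProviderSettings): string[] {
  const missing: string[] = [];
  const llm = settings.llm;
  if (!llm || !llm.baseUrl || !llm.apiKey || !llm.model) missing.push("llm");
  if (!settings.embedding || !settings.embedding.apiKey) missing.push("embedding");
  return missing;
}
```

Calling `missingRequiredProviders({})` reports both `llm` and `embedding`; the two Gemini entries never appear in the result because they are optional.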

Run

# Browser mode (backend + dashboard + web portal)
pnpm web:dev

# Desktop app (Tauri)
pnpm tauri:dev

Open the dashboard, create a project, write a spec, and start the engine. The agents take it from there.
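
Once the engine is started, the spec-driven pipeline described under Key Features moves from specification through bootstrap to development, with a checkpoint gating each transition. A minimal way to picture that gating is shown below; `PHASES`, `Checkpoint`, and `nextPhase` are hypothetical names, and the real engine's checkpoints are certainly richer than a boolean predicate.

```typescript
// Phase names follow the README; everything else here is an assumption.
const PHASES = ["specification", "bootstrap", "development"] as const;
type Phase = (typeof PHASES)[number];

// A checkpoint is a predicate that must pass before a phase can be left.
type Checkpoint = () => boolean;

// Advance to the next phase only when the current phase's checkpoint passes.
function nextPhase(current: Phase, checkpoints: Record<Phase, Checkpoint>): Phase {
  const index = PHASES.indexOf(current);
  const isLast = index === PHASES.length - 1;
  if (isLast || !checkpoints[current]()) return current;
  const next = PHASES[index + 1];
  return next ?? current;
}
```

With this shape, a failing checkpoint simply leaves the project in its current phase, and the last phase has nowhere further to advance.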

Tech Stack

| Layer | Technology |
|---|---|
| Frontend | Next.js 16, React 19, Tailwind CSS 4, Zustand, React Query |
| Backend | Fastify 5, TypeScript 5.9 |
| Database | SQLite (better-sqlite3), Drizzle ORM |
| Vector search | sqlite-vec (embeddings) |
| LLM providers | OpenAI, Anthropic Claude, Google Gemini, OpenRouter |
| Browser automation | Playwright |
| Desktop | Tauri |
| Monorepo | pnpm workspaces, Turborepo |
| Testing | Vitest |

Development

pnpm turbo typecheck   # type-check all packages
pnpm build             # build all packages
pnpm lint              # lint all packages
pnpm test              # run tests

Database migrations are managed with Drizzle Kit:

cd backend
pnpm db:generate       # generate migration from schema changes
pnpm db:studio         # interactive database browser

Migrations run automatically on server startup.

Status

CraftSwarm is an experimental project. It was actively developed for about three weeks; development then slowed, because the path to a fully functional, production-ready app is still long.

Known Caveats

  • Browser testing / feedback loop — visual testing works but the cycle of launch, screenshot, analyze, fix can still get stuck or loop
  • App deployment — agents build and test locally but there is no deployment management yet
  • Milestones — the milestone/checkpoint system exists but is incomplete and not fully wired end-to-end
  • Spec prioritization — agents don't always pick the most impactful spec items to implement first
  • Manual video recordings — the assisted testing flow with video recording and analysis is partially implemented

License

MIT
