Above: Gate H1 gallery — 26 advocate-generated mockups, side-by-side, in your browser. You pick the one that matches your intent. The rest of the pipeline (spec, tests, deploy) is driven by that pick.
TDD drove code with tests. SpecDD drove code with specs.
We put PreviewDD in front. Mockup-first, eyes-first decision-making —
143 Opus 4.7 agents turn one line of idea into a frozen full-stack app
with only two human clicks.
Preview Forge is a Claude Code plugin submitted to the Built with Opus 4.7 hackathon (April 21–28, 2026, Anthropic × Cerebral Valley).
It encodes a new software-development methodology — 3-DD — as a 143-agent virtual engineering organization that runs entirely inside Claude Code, with only Anthropic-native dependencies (no Figma, no external CDN, no third-party SaaS). One line of idea in. One frozen full-stack app out. Two human clicks.
| Cycle | Stages | Driven by | Locked artifact |
|---|---|---|---|
| ① PreviewDD (new) | 1–3 | 26 mockups diverge direction before any spec | chosen_preview.json + mockups/chosen.html |
| 🔒 Gate H1 (human) | — | Claude Design (main) / built-in Studio (fallback) | design-approved.json |
| ② SpecDD | 4–5 | OpenAPI spec drives implementation (nestia) | specs/openapi.yaml + SHA-256 .lock |
| ③ TestDD | 6–7 | Tests + scoreboard drive freeze (≥499/500) | score/report.json + .frozen-hash |
| 🚀 Gate H2 (human) | — | Deployment approval | Deployed URL or tarball |
All three cycles follow the diverge → aggregate → lock shape. Full specification (v8.0) — 2,100+ lines, single HTML file, print-friendly.
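The "lock" step of each cycle amounts to pinning an artifact by its hash. Below is a minimal Python sketch of that idea, assuming the `.lock` file holds nothing but the hex SHA-256 digest of the spec. The real lock format is not described in this README, and `write_lock` / `is_unchanged` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_lock(spec: Path, lock: Path) -> str:
    """Record the spec's digest in a lock file (assumed format: bare hex digest)."""
    digest = sha256_of(spec)
    lock.write_text(digest + "\n", encoding="utf-8")
    return digest


def is_unchanged(spec: Path, lock: Path) -> bool:
    """True if the spec still matches the digest stored at lock time."""
    return sha256_of(spec) == lock.read_text(encoding="utf-8").strip()
```

Any later edit to the spec changes the digest, so a downstream stage could refuse to run when `is_unchanged` returns `False`.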
# 1. Add this marketplace
/plugin marketplace add Two-Weeks-Team/PreviewForgeForClaudeCode
# 2. Install the plugin
/plugin install pf@two-weeks-team
# 3. Reload
/reload-plugins
# 4. Initialize memory + workspace permissions (first time per workspace)
/pf:bootstrap
# v1.5.2+: also seeds .claude/settings.local.json to suppress plugin Bash
# permission prompts. In the normal non-escalation path, /pf:new then reaches
# only the two human gates (H1 design select, H2 ship). Profile escalation
# may add a third AskUserQuestion if a HARD_REQUIRE signal triggers.
# Without this seed, Claude Code prompts for every new Bash pattern
# (mkdir, pnpm, npx, ...).
# 5. Run (profile defaults to `standard` as of v1.4.0)
/pf:new "one-line idea"
# …or pick a profile explicitly:
/pf:new "demo-class idea" --profile=standard # default — ~60k tok · 2×5 eng · 9 previews · SQLite · no Docker
/pf:new "real project" --profile=pro # ~250k tok · 3×5 eng · 18 previews · Postgres + Docker
/pf:new "production launch" --profile=max # ~600k tok · 5×5 eng · 26 previews · full CI/CD

| Profile | Previews | Eng teams | DB | Container | Panels | SCC iter | P95 ceiling | Use for |
|---|---|---|---|---|---|---|---|---|
| standard (default) | 9 | 2×5 (BE+FE) | SQLite | ❌ none | keyword-trigger | 3 | ~60k tok / 25 min | Local MVP · demo · prototyping |
| pro | 18 | 3×5 (+DB) | Postgres (dev-prod parity) | Docker + compose | keyword-trigger + escalation | 4 | ~250k tok / 70 min | Real projects |
| max | 26 | 5×5 (all) | Postgres | Docker + CI/CD | always-on | 5 | ~600k tok / 160 min | Production · baselines |
- `--previews=N` overrides the count (bounded by `max_user_expand` = 26).
- `--no-cache` bypasses the PreviewDD-level cache (7 days for standard/pro, never cached for max).
- Standard = local-first: `npm install && npm run db:push && npm run dev`, with no Docker and no Postgres setup. DB lives at `~/.preview-forge/<project>/dev.db` (outside the repo tree for security).
- Upgrade path: standard → pro via `bash scripts/graduate.sh pro` (additive; keeps your code, adds Dockerfile/compose/Postgres datasource).
- Full spec: `plugins/preview-forge/profiles/`.
When you run standard but your idea mentions enterprise signals (Stripe, PII, HIPAA, SSO provider, SOC2, multi-tenant), the plugin recommends the right profile before PreviewDD burns tokens:
Evaluation precedence (highest wins):
- Hard-require (Stripe / PII / HIPAA / auth-provider): any single hit forces upgrade. You cannot dismiss it, because false assurance is worse than friction. The `min_distinct_categories=2` floor does NOT apply here.
- Soft-suggest + category-floor (SOC2 / multi-tenant / B2B / scale): needs ≥2 distinct categories AND score ≥ threshold to ask via AskUserQuestion. Records your answer in `~/.preview-forge/escalation-history.json`. If you decline, the same signals won't re-prompt within 24h (anti-nagging).
- Hint (weak signals, score < threshold but ≥ min-floor): shows "💡 Consider --profile=pro next time" in `/pf:status`, no interruption.
Categorical scoring (not raw keyword count) means "audit logging feature" in a generic marketing copy app won't false-positive — it's one category, below the 2-category floor.
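The precedence above can be sketched in a few lines of Python. This is an illustration only: the keyword lists, category names, `SCORE_THRESHOLD`, `HINT_FLOOR`, and the `evaluate` function are assumptions made for the example, not the plugin's actual configuration or code.

```python
import re

# Hypothetical keyword -> category maps; the plugin's real lists are not shown in this README.
HARD_REQUIRE = {"stripe": "payments", "pii": "pii", "hipaa": "hipaa", "sso": "auth-provider"}
SOFT_SUGGEST = {
    "soc2": "compliance",
    "audit logging": "compliance",
    "multi-tenant": "tenancy",
    "b2b": "b2b",
}

MIN_DISTINCT_CATEGORIES = 2  # category floor named in the text
SCORE_THRESHOLD = 2.0        # assumed value
HINT_FLOOR = 1.0             # assumed value


def _mentions(keyword: str, text: str) -> bool:
    """Whole-word match, so 'sso' does not fire on 'lesson'."""
    return re.search(r"\b" + re.escape(keyword) + r"\b", text) is not None


def evaluate(idea: str) -> str:
    text = idea.lower()
    # 1. Hard-require: any single hit wins; the category floor is not consulted.
    if any(_mentions(k, text) for k in HARD_REQUIRE):
        return "hard-require"
    # 2. Soft-suggest: score distinct categories, not raw keyword hits.
    categories = {cat for k, cat in SOFT_SUGGEST.items() if _mentions(k, text)}
    score = float(len(categories))
    if len(categories) >= MIN_DISTINCT_CATEGORIES and score >= SCORE_THRESHOLD:
        return "soft-suggest"
    # 3. Hint: weak signal, surfaced passively.
    if score >= HINT_FLOOR:
        return "hint"
    return "none"
```

In this sketch, an idea that mentions both "SOC2" and "audit logging" still counts as a single category (compliance), so it stays at the hint level instead of triggering a prompt.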
- Rule 9 idea-drift detector (`hooks/idea-drift-detector.py`) catches the failure where Gate H1 picks product A but SpecDD/Engineering drift to product B. Containment coefficient over token sets (no external ML deps). Block threshold 0.3, warn at 0.4.
- P0-B cost-regression sentinel (`hooks/cost-regression.py`) compares `cost-snapshot.json` against the active profile's P95/hard ceiling every 30s. Hard breach triggers auto-pause + AskUserQuestion handoff.
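For readers unfamiliar with the containment coefficient, here is a small Python sketch of how a token-set drift check might look. It is not the contents of `hooks/idea-drift-detector.py`: the tokenizer, the function names, and the reading of the thresholds (block below 0.3, warn below 0.4) are assumptions.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word tokens; a deliberately simple tokenizer."""
    return set(re.findall(r"\w+", text.lower()))


def containment(reference: str, candidate: str) -> float:
    """Fraction of the reference's tokens that also appear in the candidate."""
    ref, cand = tokens(reference), tokens(candidate)
    if not ref:
        return 1.0  # nothing to drift from
    return len(ref & cand) / len(ref)


BLOCK_BELOW = 0.3  # thresholds from the text; the comparison direction is assumed
WARN_BELOW = 0.4


def drift_verdict(chosen_preview: str, downstream: str) -> str:
    """Classify how far a downstream artifact has drifted from the chosen preview."""
    score = containment(chosen_preview, downstream)
    if score < BLOCK_BELOW:
        return "block"
    if score < WARN_BELOW:
        return "warn"
    return "ok"
```

Containment is asymmetric: it asks how much of the chosen preview survives downstream, so a spec that adds detail is not penalized, while one that drops the original vocabulary is.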
Terminology: "v1.6 audit" / "v1.7 audit" are feature umbrella names (issue #28 family / #29–#37). Each PR within an umbrella ships under its own release-please semver tag based on Conventional Commits — so the v1.6 schema landed in semver v1.6.0, B-1/B-3/A-4 (Phase 9, PR #51) landed in semver v1.10.0, etc. See CHANGELOG.md for the per-tag mapping.
The biggest UX shift since v1.0.0: the gallery comes first, before any spec or code is written. You don't pick a feature flag matrix; you pick a picture.
Before v1.6, the 26 Advocates were dispatched directly from the one-liner, and the failure mode in LESSON 0.7 played out: the user wrote "auto-organize meeting minutes," the panel-recommended composite #1 was a Slack bot, but the user actually wanted a paralegal tool for legal depositions. A different product entirely.
v1.6 adds I1 Idea Clarifier between /pf:new and the 26 advocates. Three batched AskUserQuestion modals (10–12 fields total) produce idea.spec.json — a structured ground truth (target_persona, primary_surface, jobs_to_be_done, killer_feature, must_have_constraints, non_goals, …) that every advocate receives. Divergence is now intentional creative reframing, not blind misalignment.
The PreviewDD cache key now includes idea_spec_hash, so the same one-liner with different Socratic answers gets a fresh advocate set instead of a stale replay.
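One way such a cache key could be built is sketched below in Python. The exact key composition is an assumption: this README only states that `idea_spec_hash` is part of the key, so the inclusion of the profile, the separator, and both function names are illustrative.

```python
import hashlib
import json


def idea_spec_hash(spec: dict) -> str:
    """Hash a spec dict canonically so key order does not change the digest."""
    canonical = json.dumps(spec, sort_keys=True, ensure_ascii=False, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def preview_cache_key(idea: str, profile: str, spec: dict) -> str:
    """Combine the one-liner, the profile, and the spec hash into one cache key."""
    parts = [idea.strip(), profile, idea_spec_hash(spec)]
    # The unit-separator character keeps adjacent parts from running together.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()
```

With this shape, two runs of the same one-liner share a cache entry only when their interview answers are identical.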
v1.7 — 4 required questions, skip-interview, tiered fallback (Phase 9 — Christensen + Kim-Mauborgne + Taleb)
Hackathon demo feedback: 12 questions before seeing any output is too many for a 3-minute pitch. v1.7 trims the contract:
- B-1: 4 required, 5–8 optional (`persona.profile` / `surface.platform` / `killer_feature` / `must_have_constraints[≥1]`). Best path: 4 clicks total to land on the gallery. Fullest path: 12 questions for deep dive. User choice per modal.
- B-3: Skip interview option in Batch A. One click writes a 3-field stub (`_schema_version` + `_filled_ratio` + `idea_summary` only) and short-circuits to the v1.5.4 raw-idea path. Demo escape hatch when the interview itself becomes friction.
- A-4: `_filled_ratio` tiered fallback. The hard 0.5 gate is gone. Now `≥ 0.7` = high-confidence ground truth, `0.4–0.7` = hint, `0.2–0.4` = low-confidence, `< 0.2` = drop spec entirely (Skip-interview lands here at ratio ≈ 0.11). No path is blocked.
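The A-4 tiers map naturally onto a small lookup function. The Python sketch below assumes each tier's lower bound is inclusive (the text does not say which side owns 0.4 and 0.2), and the function name and tier labels are made up for illustration.

```python
def confidence_tier(filled_ratio: float) -> str:
    """Map a _filled_ratio value to its fallback tier (lower bounds assumed inclusive)."""
    if filled_ratio >= 0.7:
        return "high-confidence"  # treat the spec as ground truth
    if filled_ratio >= 0.4:
        return "hint"             # pass the spec along as a hint
    if filled_ratio >= 0.2:
        return "low-confidence"
    return "drop-spec"            # fall back to the raw-idea path
```

Every ratio lands in some tier, which is what "no path is blocked" means in practice: the skip-interview stub at roughly 0.11 simply resolves to the drop-spec branch.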
The flow inverts the SaaS-onboarding default of "configure → preview." Instead: answer 4 questions → see 9 / 18 / 26 mockups → pick one. The picture is the spec. SpecDD and TestDD only run on the picture you approved. (Godin: lead with the artifact, not the form.)
We release patches and feature updates frequently (see CHANGELOG.md and Releases). To update your local install:
# Check installed version
claude plugin list | grep -A2 pf@two-weeks-team
# Pull the latest manifest + plugin contents from the marketplace
/plugin marketplace update two-weeks-team
# Upgrade the plugin to the newest listed version
/plugin update pf@two-weeks-team # if you have this subcommand
# — or, if update is not available in your Claude Code version —
/plugin uninstall pf@two-weeks-team
/plugin install pf@two-weeks-team
# Reload so hooks, agents, and commands refresh
/reload-plugins

After updating, run `pf check` (or `/pf:bootstrap` once, then `pf check`) to confirm your local `~/.claude/preview-forge/memory/` is still intact. The update does not overwrite your `LESSONS.md`, so any cross-run learning you've accumulated is preserved.
Downgrading (if a new version breaks something):
/plugin uninstall pf@two-weeks-team
/plugin install pf@two-weeks-team@1.0.0 # any past version tag

Every release is signed via GitHub Releases, so you can verify the manifest version in `plugin.json` matches the tag.
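A manual version check could look like the Python sketch below. It assumes the manifest is a JSON file with a top-level `version` field and that release tags may carry a leading `v`; `manifest_matches_tag` is a hypothetical helper, not something the plugin ships.

```python
import json
from pathlib import Path


def manifest_matches_tag(manifest_path: Path, tag: str) -> bool:
    """Compare the manifest's version field with a release tag, ignoring a leading 'v'."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    return manifest["version"] == tag.removeprefix("v")
```

Point it at your local copy of `plugin.json` and the tag you intended to install; a `False` result suggests the install did not land on the version you expected.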
Preview Forge ships 14 slash commands under the /pf:* namespace:
| Command | Purpose |
|---|---|
| `/pf:bootstrap` | Initialize plugin memory + seed workspace Bash permissions (first time per workspace) |
| `/pf:new <idea>` | Start a new run (PreviewDD cycle begins) |
| `/pf:status` | Current run state, agent progress, blackboard |
| `/pf:retry <agent\|phase>` | Rerun a failed agent or stuck phase |
| `/pf:freeze` | Force Judges + Auditors evaluation (TestDD Stage 7) |
| `/pf:design` | Gate H1: Claude Design main / built-in Studio fallback |
| `/pf:panel` | Manually trigger 4-Panel (TP/BP/UP/RP) vote |
| `/pf:gallery` | Browse / fork past runs |
| `/pf:replay <run>` | Deterministic replay from `trace.jsonl` |
| `/pf:seed` | Pre-verified demo idea bank (10) |
| `/pf:export <run>` | Package frozen run as tarball or Claude Code plugin |
| `/pf:budget` | Cost dashboard: per-run / per-cycle / per-agent |
| `/pf:lessons` | Cross-run failure catalog (`LESSONS.md`) |
| `/pf:help` | Full 14-command reference + FAQ |
Preview Forge's 143 agents live in a 6-tier hierarchy + SQLite blackboard:
                       M1 Run Supervisor (Meta)
                                   │
                  ┌────────────────┼────────────────┐
                  │                │                │
           M2 Cost Monitor  M3 Chief Eng PM  Software-Factory
           (tracking only)  (all dept leads)  Layer-0 Hooks
                                   │
        ┌──────────┬───────────────┼────────────────┬─────────────┐
        │          │               │                │             │
    Ideation  4 Panels +       Spec Dept      5 Engineering   QA Dept +
    Dept      Mitigation       (9)            Teams (25)      SCC + Judges +
    (29)      Designer (45)                                   Auditors + Docs
                                                              (32)
Count: 3 Meta + 29 Ideation + 45 Panels + 9 Spec + 25 Engineering + 14 QA + 5 SCC + 5 Judges + 5 Auditors + 3 Docs = 143. All Opus 4.7, zero Sonnet/Haiku.
- Claude Code (latest) with Pro / Max / Team / Enterprise subscription. (No separate API key needed.)
- Node.js 20 LTS + pnpm 9 (for scaffolded apps' build/test)
- Docker 24+ (optional, for scaffolded apps' `docker compose up` verification)
| Area | Count | Summary |
|---|---|---|
| Agents | 143 | 10 departments, 6 tiers, all Opus 4.7 |
| Slash commands | 14 | /pf:* namespace |
| Hooks | 3 | factory-policy.py, askuser-enforcement.py, auto-retro-trigger.py |
| Memory seed | 3 | CLAUDE.md + PROGRESS.md + LESSONS.md (with 3 bootstrap lessons) |
| Methodology | 1 | Layer-0 7 non-negotiable rules |
| Asset templates | 4 | Docker Compose, Caddyfile, nestia.config.ts, install.sh |
| JSON schemas | 3 | PreviewCard, PanelVote, ScoreReport |
| Seed ideas | 10 | Pre-verified demo scenarios |
| CLI | 1 | bin/pf |
| Verification | 1 | scripts/verify-plugin.sh (56 checks) |
Preview Forge uses only Anthropic-native features:
- Claude Code (Pro/Max) · Claude Opus 4.7 · Adaptive thinking · `xhigh` effort
- Claude Managed Agents · Anthropic Memory Tool · Batch API · Files API · Citations
- Context editing (`context-management-2025-06-27`) · Compaction (`compact_20260112`)
- Prompt caching (1-hour TTL) · Fine-grained tool streaming · Task budgets (`task-budgets-2026-03-13`)
- Claude Design (Gate H1 main) · Built-in Design Studio (Gate H1 fallback)
Not used: Figma, Google Fonts, external CDNs, analytics services. All 26 mockups are single-file HTML with inline styles only.
Preview Forge maintains a 4-layer memory so mistakes don't repeat across runs:
- `memory/CLAUDE.md`: session rules (read first every run)
- `memory/PROGRESS.md`: run index (updated at run end)
- `memory/LESSONS.md`: failure catalog (auto-appended by Auto-retro critic)
- Anthropic Memory Tool (`memory_20250818`): per-agent episodic memory (Reflexion pattern)
M1 Run Supervisor reads all four before every new run and pre-loads relevant lessons to every Department Lead.
- 📘 Full v8.0 Specification — canonical, 2,100+ lines
- 📝 CHANGELOG — phase-by-phase build log
- 🛡️ Security Policy — reporting and scope
- 🤝 Contributing — LESSONS, new advocates, etc.
- 🪶 Layer-0 Rules — 7 non-negotiable
git clone https://github.com/Two-Weeks-Team/PreviewForgeForClaudeCode
cd PreviewForgeForClaudeCode
bash scripts/verify-plugin.sh # 56/56 checks

Built for the Anthropic × Cerebral Valley Built with Opus 4.7 hackathon (April 21–28, 2026). Targeted prize categories:
- 🏆 Most Creative Opus 4.7 — 143 parallel personas + self-critic + self-scoring
- 🏆 Best Managed Agents — hours-long build/test/correct cycles in a managed session
- 🏆 Keep Thinking — "TDD + SpecDD didn't touch ideation. We put PreviewDD in front."
Apache-2.0. See NOTICE for attribution.
Built with Claude Opus 4.7 · Powered by Claude Code Plugins · Zero third-party deps · Apache-2.0