v2.3.0 — GAN-harness design-loop integration (E1–E5)
[2.3.0] — 2026-05-31 · GAN-harness design-loop integration (E1–E5)
Closes the five deltas between blitz's design loop and the planner/generator/evaluator harness in anthropic.com/engineering/harness-design-long-running-apps. Blitz already had the architecture (sprint-plan → ui-build/sprint-dev → design-critic/critic); these are the deltas, not a rebuild. Specs: docs/integrations/harness-design/.
Added
- `skills/_shared/design-criteria.md` — single-source 5-dimension design rubric, shared by the generator (steering) and evaluator (scoring). The criteria themselves steer the model off generic defaults before any evaluator cycle.
- E1 criteria-as-steering — `ui-build` Phase 3.0.1.1 carries the 5 dims ("museum quality") into generation, not just into the evaluator. Tone-conditional phrasing for informal tones.
- E2 live-navigating evaluator — `agents/design-critic.md` granted the Playwright navigation subset and navigates the live page before scoring (click primaries, exercise states, resize for responsive, read console). New `coverage_boundary` reply field; static-screenshot path retained as fallback (never silently passes interaction dims). `maxTurns` 15→30. `browser_run_code_unsafe` / `browser_evaluate` deliberately NOT granted (threat-model §5 posture).
- E3 iterate + pivot — `ui-build` Phase 5.4.2 flat-3 cap replaced with `ceiling = min(10, budget)`; refine-vs-pivot strategic decision after each evaluation (pivot space = the 13-tone menu).
- E4 sprint-contract negotiation — `sprint-dev` Phase 0.6: generator↔evaluator negotiate testable acceptance before code; persisted as co-owned `scope.acceptance`. Registered in `state-handoff.md`.
- E5 capability-relative trigger — `ui-build` `standard` tier evaluates only on edge-of-capability signals (novel aesthetic / interaction complexity / low generator confidence / deterministic-lane hits); `high` always evaluates. Re-examine per model release; cites the v1.16.0/cohesion/det-20 detector re-justification precedent.
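
The E3 cap and the E5 trigger above can be sketched roughly as follows. This is a minimal illustration, not blitz's actual code: only `ceiling = min(10, budget)`, the `standard` and `high` tier names, and the four signal categories come from the entries above. The function names, the `Signals` shape, the assumption that `budget` counts iterations, and the refine-vs-pivot heuristic are all hypothetical.

```python
from dataclasses import dataclass

MAX_ITERATIONS = 10  # upper bound taken from `ceiling = min(10, budget)`


def iteration_ceiling(budget: int) -> int:
    """E3: cap the evaluate/iterate loop (assumes budget is an iteration count)."""
    return min(MAX_ITERATIONS, budget)


@dataclass
class Signals:
    """Hypothetical container for the four edge-of-capability signals in E5."""
    novel_aesthetic: bool = False
    interaction_complexity: bool = False
    low_generator_confidence: bool = False
    deterministic_lane_hits: bool = False


def should_evaluate(tier: str, signals: Signals) -> bool:
    """E5: `high` always evaluates; `standard` only when a signal fires."""
    if tier == "high":
        return True
    if tier == "standard":
        return any((
            signals.novel_aesthetic,
            signals.interaction_complexity,
            signals.low_generator_confidence,
            signals.deterministic_lane_hits,
        ))
    # Other tiers are not described in the entry; skipping is an assumption.
    return False


def choose_strategy(previous_score: float, score: float) -> str:
    """Placeholder heuristic (an assumption): refine while scores improve,
    pivot to another tone once they stall or regress."""
    return "refine" if score > previous_score else "pivot"
```

With these definitions, `iteration_ceiling(25)` is capped at 10 while `iteration_ceiling(3)` stays at 3, and a `standard`-tier build with no signals set skips evaluation entirely.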
Changed
- `agents/design-critic.md` — "read screenshots, not source" → "read the rendered app, not the source" (input surface expands to live DOM; the source prohibition stands).
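
One way the static-screenshot fallback from E2 could avoid silently passing interaction dimensions is sketched below. Only the `coverage_boundary` field name comes from this changelog. The dimension names, the `Verdict` shape, and `score_page` are hypothetical, chosen purely to illustrate the idea that an unexercised dimension is reported as unscored rather than given a passing score.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical: rubric dimensions that can only be judged by live navigation.
INTERACTION_DIMS = {"interaction", "responsiveness"}


@dataclass
class Verdict:
    scores: Dict[str, Optional[float]]
    # Lists what the evaluator could not exercise on this run.
    coverage_boundary: List[str] = field(default_factory=list)


def score_page(raw_scores: Dict[str, float], live_navigation: bool) -> Verdict:
    """Return scores, withholding interaction dims when only a screenshot was seen."""
    if live_navigation:
        return Verdict(scores=dict(raw_scores))

    scores: Dict[str, Optional[float]] = {}
    boundary: List[str] = []
    for dim, value in raw_scores.items():
        if dim in INTERACTION_DIMS:
            scores[dim] = None  # unscored, never an implicit pass
            boundary.append(f"{dim}: not exercised (static screenshot only)")
        else:
            scores[dim] = value
    return Verdict(scores=scores, coverage_boundary=boundary)
```

On the fallback path, a caller sees `None` for each interaction dimension plus an explicit `coverage_boundary` entry, so a missing live session cannot be mistaken for a clean result.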