Atelier

The design-token discipline layer for AI-generated codebases.

v0 generates. Cursor edits. Atelier enforces.



The problem

Open any AI-generated React component. Count the raw color refs:

// Generated by v0 / Cursor / Claude Code last week
<button className="bg-zinc-900 text-zinc-50 hover:bg-zinc-800 border border-zinc-200">
  Save
</button>

Now open the project's tailwind.config.ts. There's a perfectly good bg-foreground token sitting right there. The agent didn't use it.

Multiply by 50 components. Different shades of zinc on different pages. Theme switching is impossible. The "design system" is an artifact of the original setup that nothing actually conforms to.

This isn't an agent bug. It's a missing contract. AI tools generate against the densest pattern in their training data — Tailwind's raw palette — unless something stops them.

What Atelier does

Atelier is the lint layer that stops them. It builds on @google/design.md (the Google Labs spec for machine-checkable design tokens) and adds three things the spec doesn't ship:

  1. A precedence rule. When tokens are resolvable from multiple sources — your local DESIGN.md, a build-category default, the raw Tailwind palette — Atelier defines what wins. Project intent always survives a regeneration pass.
  2. An empirically derived 8-role vocabulary. Every production DESIGN.md we surveyed used different names for the same eight roles. Atelier defines the canonical set so a linter can check role coverage regardless of internal naming.
  3. A build-category atlas. Seven categories (SaaS dashboard, marketing landing, trading analytics, conversational UI, internal ops, multi-LLM synthesis, marketplace listing) with sensible default DNA. New projects inherit a real starting palette instead of raw Tailwind.

The same machinery ships as a CLI, a project audit, a TypeScript classifier, and an MCP server agents can call directly to lint their own output.

Does it work?

+149.98% relative lift in semantic-token conformance over a raw-Tailwind baseline. Three-arm benchmark, 24 repos, pre-registered methodology, two-classifier sensitivity check.

| Arm | Conformance (v2-broad) | n |
| --- | --- | --- |
| Repo with DESIGN.md | 78.5% | 3/4 |
| shadcn-style default vocab | 61.7% | 10/10 |
| Raw Tailwind palette only | 31.4% | 9/10 |

The DESIGN.md arm beats shadcn-default by +16.87pp absolute, and the raw-palette arm by +149.98% relative. The pre-registered primary gate required ≥+15pp absolute over shadcn-default. Verdict: PASS.
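
As a quick sanity check, the headline figures fall straight out of the arm means: the absolute gate is measured against the shadcn-default arm, while the relative lift is measured against the raw-palette arm.

```typescript
// Arm means rounded to one decimal; exact per-repo numbers live in
// benchmarks/results/2026-05-07-phase-1-v2.md.
const designMdArm = 78.5;   // repos with a committed DESIGN.md, %
const shadcnArm = 61.7;     // shadcn-style default vocab, %
const rawPaletteArm = 31.4; // raw Tailwind palette only, %

// Absolute gap, in percentage points, versus the shadcn-default arm.
const absoluteGapPp = designMdArm - shadcnArm;                      // ~ +16.8pp

// Relative lift versus the raw-palette arm.
const relativeLift = (designMdArm - rawPaletteArm) / rawPaletteArm; // ~ 1.5, i.e. ~+150%
```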

The strict-vs-broad sensitivity check agrees on the verdict. Methodology spec was committed to git before the runner saw it. Caveats and per-repo numbers are in benchmarks/results/2026-05-07-phase-1-v2.md. The corpus is single-developer; external validation requires fork-and-rerun. A generative arm-vs-arm study (causation, not correlation) is on the v0.2.0+ roadmap.

Quickstart

npm install -g @atelier-oss/cli

# In any project root
atelier init                 # writes a starter DESIGN.md
atelier lint DESIGN.md       # validates the token contract
atelier classify .           # scores token-vs-raw conformance
atelier atlas fingerprint .  # detects build category (saas-dashboard, marketing-landing, ...)
atelier audit                # six-section project health check

First useful signal in under 30 seconds: atelier classify . returns a single number per file — the share of color/spacing/typography references that resolve to declared tokens vs raw values. Run it before and after your next AI-generated PR.

How precedence works

┌──────────────────────────────────────────────────────────┐
│  Explicit  →  the project's own DESIGN.md                │
│       ↓                                                  │
│  Atlas     →  build-category default DNA (saas-dashboard,│
│               marketing-landing, ...)                    │
│       ↓                                                  │
│  Palette   →  raw framework fallback (Tailwind zinc-*,   │
│               blue-*, ...)                               │
└──────────────────────────────────────────────────────────┘
        Higher source ALWAYS wins.
        Atlas MUST NOT silently shadow explicit.
        Palette is last-resort; lint warns on use.

This encodes early-return semantics for codegen pipelines. A regenerate pass that starts with palette refs gets warned. A new file that reaches for bg-zinc-900 when bg-foreground is declared gets warned. Your project's intent — whatever's in your DESIGN.md — is uncontestable.
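
The resolution order is plain early-return logic. A minimal sketch, assuming a toy registry shape (the names and types here are illustrative, not the @atelier-oss/lint API):

```typescript
// Hypothetical sketch of precedence resolution; names are illustrative.
type Source = "explicit" | "atlas" | "palette";

interface Registry {
  explicit: Map<string, string>; // tokens declared in the project's DESIGN.md
  atlas: Map<string, string>;    // build-category default DNA
  palette: Map<string, string>;  // raw framework fallback (lint warns on use)
}

// Early-return semantics: the highest source that resolves the role wins,
// so atlas defaults can never silently shadow explicit project intent.
function resolveToken(
  role: string,
  reg: Registry,
): { value: string; source: Source } | null {
  if (reg.explicit.has(role)) return { value: reg.explicit.get(role)!, source: "explicit" };
  if (reg.atlas.has(role)) return { value: reg.atlas.get(role)!, source: "atlas" };
  if (reg.palette.has(role)) return { value: reg.palette.get(role)!, source: "palette" };
  return null;
}
```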

Findings:

  • ATELIER_PRECEDENCE_VIOLATION (warning) — atlas default shadowed an explicit token.
  • ATELIER_MISSING_ROLE (info) — a canonical role from the 8-role set has no satisfying token.

Neither is an error by default. Partial coverage is valid; full coverage is recommended.

The 8 canonical roles

| Role | Purpose |
| --- | --- |
| background | Default page surface |
| foreground | Default text on background |
| primary | Brand action (CTA, links, focus targets) |
| primary-foreground | Legible text on primary |
| accent | Secondary emphasis (badges, highlights) |
| muted | Low-contrast supporting tone (timestamps, captions) |
| border | Surface separator color |
| ring | Focus indicator color |

A DESIGN.md may use any internal naming (c-bg, bg, background are all valid). Map your names to roles via the optional aliases block:

aliases:
  background: c-bg
  foreground: c-text
  primary: c-accent

Coverage is a recommendation, never an error. Full vocab in spec/DESIGN.md.spec.md.
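
Mechanically, coverage checking just resolves each canonical role through the aliases map before looking it up in the project's declared tokens. A minimal sketch with hypothetical names (not the shipped linter's API):

```typescript
// Illustrative coverage check; not the @atelier-oss/lint implementation.
const CANONICAL_ROLES = [
  "background", "foreground", "primary", "primary-foreground",
  "accent", "muted", "border", "ring",
];

// A role is covered if the project declares a token under either the
// canonical name or the name it mapped in the aliases block.
function missingRoles(
  declared: Set<string>,
  aliases: Record<string, string> = {},
): string[] {
  return CANONICAL_ROLES.filter((role) => !declared.has(aliases[role] ?? role));
}
```

With the aliases block above, a project declaring `c-bg`, `c-text`, and `c-accent` covers background, foreground, and primary under its own naming; any roles it leaves out surface as ATELIER_MISSING_ROLE info findings, never errors.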

For AI coding agents (MCP server)

Atelier ships an MCP stdio server so any MCP-capable agent (Claude Code, Cursor, Cline, etc.) can call the linter, classifier, and audit directly:

// .mcp.json
{
  "mcpServers": {
    "atelier": {
      "command": "npx",
      "args": ["-y", "@atelier-oss/mcp-server"]
    }
  }
}

Exposed tools:

  • atelier_lint — lint a DESIGN.md against the spec + extensions
  • atelier_classify — score a file or directory for token conformance
  • atelier_audit — run the six-section project audit
  • atelier_atlas_fingerprint — detect build category from a repo path

The intended workflow: an agent generates a component, calls atelier_classify on the diff, and rewrites any raw palette references before opening the PR. Closes the loop without a human in it.

Packages

| Package | Role |
| --- | --- |
| @atelier-oss/cli | The everyday binary. atelier init / lint / classify / atlas / audit. |
| @atelier-oss/lint | Wraps @google/design.md@0.1.1 (Apache-2.0) and adds the precedence rule + 8 roles. |
| @atelier-oss/classify | Token-vs-raw scorer. The engine behind the +149.98% number. |
| @atelier-oss/atlas | Fingerprints a repo, returns build category and default DNA. |
| @atelier-oss/audit | Six-section health check: token usage, contrast, motion, a11y, design coverage, responsive. |
| @atelier-oss/mcp-server | MCP stdio server exposing the above as agent-callable tools. |

Each package ships independently via changesets. All published with sigstore provenance attestations.

How Atelier compares

| Tool | Lints DESIGN.md | Precedence rule | Role vocab | Build-category atlas | Agent / MCP support | Empirical lift number |
| --- | --- | --- | --- | --- | --- | --- |
| Atelier | yes | yes | yes (8 roles) | yes (7 categories) | yes (MCP server) | +149.98% relative |
| @google/design.md | yes | no | no | no | no | n/a |
| eslint-plugin-tailwindcss | no | no | no | no | no | n/a |
| style-dictionary | no | no | no | no | no | n/a |
| Hand-rolled tailwind.config.ts | no | no | no | no | no | n/a |
| shadcn/ui defaults | no | no | partial (~6 roles, undocumented) | n/a (defaults only) | no | n/a |

Atelier is additive to @google/design.md. The precedence rule and role vocabulary are proposed upstream as a PR (google-labs-code/design.md#76). If the spec adopts them, this layer becomes a no-op compatibility shim.

Methodology — where the +149.98% comes from

Three-arm observational benchmark, pre-registered before the runner expanded:

  • Arm A — repos with a committed DESIGN.md (n=4, 3 kept after dropouts)
  • Arm B — repos using shadcn-style default vocabulary (Radix UI + shadcn token names) without DESIGN.md (n=10)
  • Arm C — repos using raw Tailwind palette refs only (n=10, 9 kept)

Each repo gets walked, every Tailwind class extracted, and each class scored against the project's declared token registry. Token-conforming references count as 1; raw palette references count as 0. Per-repo conformance is the mean. Arm conformance is the mean of repo means.
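
The scoring rule can be sketched in a few lines. This is an illustration of the arithmetic only, not the shipped classifier; the raw-palette regex is a deliberately narrow stand-in covering a few common Tailwind color families.

```typescript
// Stand-in for raw palette detection (the real classifier's extraction
// and registry resolution are more involved).
const RAW_PALETTE = /^(?:bg|text|border|ring)-(?:zinc|slate|gray|neutral|blue|red|green)-\d{2,3}$/;

// Token-conforming references score 1, raw palette references score 0;
// classes outside the registry and the palette (layout utilities etc.)
// are out of scope for the conformance number.
function repoConformance(classes: string[], registry: Set<string>): number {
  const inScope = classes.filter((c) => registry.has(c) || RAW_PALETTE.test(c));
  if (inScope.length === 0) return 1; // no scoreable refs: vacuously conformant
  const conforming = inScope.filter((c) => registry.has(c)).length;
  return conforming / inScope.length;
}

// Arm conformance is the mean of repo means, not a pooled average,
// so large repos cannot dominate the arm number.
function armConformance(repoMeans: number[]): number {
  return repoMeans.reduce((a, b) => a + b, 0) / repoMeans.length;
}
```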

Two classifier modes for sensitivity:

  • v2-strict — only DESIGN.md-declared tokens count as registry. Hard test of explicit-spec adoption.
  • v2-broad (primary gate) — DESIGN.md + tailwind.config + CSS variables count as registry. Reflects shipped reality.

Both modes agree on the verdict. The pre-registered primary gate (DESIGN.md ≥ shadcn-default + 15pp absolute) was crossed at +16.87pp.

Caveats kept honest:

  • Observational, not causal. The next milestone is a generative study (same prompt, with vs. without DESIGN.md in context).
  • Single-developer corpus. External validation requires fork-and-rerun.
  • The first MVB (4+4 repos, two-arm) reported +105% relative lift but conflated shadcn-style and raw-palette in the control. The three-arm split above is the corrected number.

Re-run anytime:

python3 -m benchmarks.runner       # corpus walk + scoring
python3 -m benchmarks.parity_check # 60/60 oracle for the TS port of the scorer

Spec at benchmarks/spec-v2.md. Result at benchmarks/results/2026-05-07-phase-1-v2.md.

Roadmap

v0.1.0 (shipped 2026-05-08) — Tailwind v3 support, six packages on npm with sigstore provenance, three-arm benchmark cleared, upstream proposal PR open as draft.

v0.2.0 — Tailwind v4 support: @theme blocks as a registry source, oklch() palette in atlas, mixed-mode detection. Plan: docs/v0.2.0-plan.md.

v0.3.0 — Generative benchmark arm (causation, not correlation). Same prompt run with and without DESIGN.md in context, conformance measured at generation time. ~$50-100 in API spend.

v1.0.0 — Pending upstream maintainer signal on PR #76. If the precedence rule and role vocab land in @google/design.md, the wrapper collapses and v1.0 ships as a thin convenience layer. If they don't, v1.0 is the long-lived fork.

Contributing

The project is small and most decisions are still open. Things that would help right now:

  1. Run the benchmark on your own repos. Fork, point the runner at your codebase, open an issue with the result. External validation is the single biggest open question.
  2. Try the MCP server with your agent of choice. Open an issue with the workflow that worked (or didn't).
  3. Submit DESIGN.md examples. The 8-role vocab was derived from 4 production files. Wider sampling sharpens the canonical set.
  4. Question the role list. If your DESIGN.md needs a role outside the 8, that's a spec proposal worth making.

PRs welcome. Issues are the right place to start for anything bigger than a typo. Conventional Commits, please.

License

MIT for Atelier code. Apache-2.0 for the bundled @google/design.md dependency (preserved verbatim, full NOTICE at repo root). The wrapper layer is additive; if the upstream package adopts the extensions, this fork collapses.

Credits

  • Google Labs for the DESIGN.md spec.
  • The four anonymous DESIGN.md authors whose naming patterns shaped the canonical role list.

If Atelier saves you a debugging hour, star the repo. It's the cheapest way to tell us the lift number isn't an artifact of the corpus.
