surface is a pattern + skill that lets an agent generate ephemeral, structured UI surfaces to collect ad-hoc input from a user, and react to submissions autonomously. The surface is a URL pointing at agent-rendered HTML; the agent owns what each affordance means; submissions arrive in known shape. Surfaces are also a primary tool for showing users information — tables, grouped lists, flagged rows, and rich layout communicate at a glance what chat text or a static document cannot. v0 ships the skill bundle, four reference servers (Go, Python, Node, Rust) for the local-loopback substrate, and a hosted-substrate wire walkthrough (references/hosted-example.md). No installable binary yet; a clean-room Cloudflare Worker reference is still planned.
Agents already have three ways to collect structured input, each with a gap: a chat reply (unstructured, and only if the user is in chat), an inline chat-client widget (structured, but trapped inside a supported chat surface), or a purpose-built app or form (full UI, but real build-and-maintain cost). surface fills the space between them — task-shaped UI the agent generates for the moment and discards.
The honest objection is "isn't this just a web form?" The answer is no, and the reason is the shift in who builds it and how disposable it is:
- The cost of bespoke collapsed to the cost of asking. A form builder gives you fixed fields and one respondent. surface lets an agent generate a UI shaped to the task — a drag-to-rank, a floor-plan annotator, a refereed two-player game, a flagged-transactions review with per-row decisions — in the time it takes to describe it, then throw it away. When making a custom interactive surface gets that cheap, the calculus flips: interactions that were never worth building a UI for (too one-off, too oddly-shaped, too ephemeral) become worth a surface, because nobody has to build and own anything.
- The URL carries the whole interaction. Because the response surface lives at the URL, any outbound channel — email, SMS, push, a paging system — becomes a reply path, not just a notification. That reframes "the user isn't in chat" from a dead end into a delivery choice.
- Reactions are code, so monitoring is cheap. The agent encodes the drain-and-react logic and lets it run; it only re-engages for submissions that genuinely need judgment. Watching a live surface is not an LLM-call-per-interaction tax — the mechanical reactions cost nothing once written.
- Ephemeral by design. surface is for the moment, not forever. For durable, recurring needs, a real app or form tool is the right call — and the skill says so. The boundary is the point: surface fills the gap below the threshold where standing up and maintaining a tool makes sense.
surface deliberately ships narrow and grows on real-use signal. Currently out of scope:
- A bundled/installable server binary. v0 is skill-only — the reference servers in
skills/surface/examples/exist to be read and re-implemented, not installed. A canonicalsurface-serveis a v1 question. - Templating / surface-authoring helpers. The agent writes the HTML/JS directly; a helper layer waits on friction signal.
- Substantive prompt-injection mitigation patterns.
references/security.mdnames the caution; deeper sanitization guidance accrues as real untrusted-input use does. - Persistent surfaces, link expiration, one-time-use semantics. Surfaces are ephemeral; agents handle lifetime in their own state if they need it.
(Hosted deployment and a push/WebSocket transport were once on this list — both have since shipped, as references/hosted-example.md and references/websocket-example.md.)
The fast path — hand it to your agent. You don't have to follow steps by hand. In any agent session (Claude Code, Codex, or Cowork), paste this one prompt — it installs the skill, runs setup, and builds your first demo:
Install the surface skill from https://github.com/aac/surface: clone the repo
and run ./install.sh (it auto-detects Claude Code or Codex and symlinks the
skill into my skills directory). Then run surface's setup step so it records
what my environment can do, to ~/.surface/environment.md. Then use the surface
skill to build me a tic-tac-toe game I can play in my browser — I'll click to
play X, and you drain each of my moves off the wire and play O back onto the
same board, until someone wins.
That single prompt covers all three stages — install → setup → first demo. Everything below is the same, broken out for doing any stage by hand.
The repo is packaged for both Claude Code and Codex. The actual skill bundle lives under skills/surface/ and is harness-neutral — the same skill bytes load on either harness. Only the packaging wrapper differs.
.claude-plugin/plugin.json at the root declares the Claude plugin. Two install paths:
Claude Code CLI (skills-dir symlink):
git clone <repo-url> ~/Workspace/surface
ln -s ~/Workspace/surface/skills/surface ~/.claude/skills/surfaceClaude Desktop / Cowork (plugin install): install the plugin from this repo so it appears in the customize view. Plugin discovery is driven by .claude-plugin/plugin.json at the repo root; the skill is auto-discovered under skills/surface/.
.codex-plugin/plugin.json at the root declares the Codex plugin. Skill bundle discovered under skills/surface/.
Codex CLI (skills-dir symlink):
git clone <repo-url> ~/Workspace/surface
ln -s ~/Workspace/surface/skills/surface ~/.codex/skills/surfacePlugin install: point Codex at this repo; the .codex-plugin/plugin.json manifest carries the skill pointer and metadata.
For offline use or environments without plugin marketplace access, install.sh at the repo root handles detection, symlink install, and uninstall for both harnesses:
git clone https://github.com/aac/surface.git
cd surface
./install.sh # auto-detects Claude Code or Codex
./install.sh --target claude # override to Claude Code
./install.sh --target codex # override to Codex
./install.sh --uninstall # remove the installed skillThe script can also be piped via curl — it will clone the repo to ~/.local/share/surface/ if no local checkout is found. See the # Usage: block at the top of install.sh for full details.
Codex lifecycle primitive mapping. The mechanism categories in skills/surface/references/lifecycle.md (push-stream, polling, FS watch, hosted poll, webhook) are harness-neutral; primitives differ:
| Category | Claude Code | Codex |
|---|---|---|
| Push-stream on subprocess stdout | Bash(run_in_background) + Monitor |
Long-running exec_command session + write_stdin/output polling |
| Scheduled wake-ups for cadence | ScheduleWakeup, /loop |
Heartbeat automations |
| FS drop-directory watch | fswatch/inotifywait/polling |
Same — OS-level primitives are harness-neutral |
| Hosted poll | WebFetch / HTTP |
WebFetch / HTTP — same |
| Tear-down | KillShell |
Codex session/process-group teardown |
A surface that requires the user to come back to chat and say "I clicked it" has failed the pattern. Any Codex adaptation must include a real drain path (long-running stdout polling, drop-directory polling, heartbeat-driven re-check, hosted poll, or webhook where available).
Each harness loads skills/surface/SKILL.md and what it explicitly references; everything else in this repo is for humans reading the project.
The first time you use surface interactively, the agent runs a quick setup pass: it surveys what your environment can do — local loopback, any tunnel CLIs, any hosted substrate you've configured — and records the findings (plus where the credentials for non-local delivery live, the locations, not the secrets) to ~/.surface/environment.md. Later runs read that file instead of re-probing, and an agent running autonomously (cron, a scheduled job) reads it instead of asking you.
For a local browser demo you don't need to set anything up — loopback works out of the box, so "ask the agent for a tic-tac-toe game" just works. Setup earns its keep the moment you want surface to deliver a URL through another channel (email, SMS, a hosted endpoint): that's what the environment file is for. The one-paste prompt above runs this step for you.
You never run surface yourself. You tell your agent what you want; it mints the surface, serves it, hands you a URL, and reacts to whatever you do on the page — no coming back to chat to say "I clicked it."
Tic-tac-toe you actually play (the flagship). Paste this:
Use the surface skill to build me a tic-tac-toe game I can play in my browser.
I'll click squares to play X; you drain each move off the wire, play O, and
push the updated board back onto my screen. Keep going until someone wins.
You get a real game in a browser tab: your clicks land as X, the agent plays O back onto the same board within a second or two, and it calls the win. That one prompt exercises every part of the pattern — opaque-ID affordances, autonomous draining, react-on-the-surface — on a rich rendering canvas.
More to ask for — same pattern, wildly different shapes:
- "Show me these 8 thumbnails and let me click the one to ship."
- "Give me a drag-to-rank surface for these 10 priorities, then tell me the order I chose."
- "Render these 30 flagged transactions as a table with approve / reject per row, and act on my picks."
- "Build an approval gate for this deploy, text me the link, and proceed only once I approve."
Each is a single ask. The agent shapes the UI to the task and throws it away when the task is done.
Shipped as the skill (Claude loads these at runtime, all under skills/surface/):
| Path | Purpose |
|---|---|
skills/surface/SKILL.md |
Skill entry point — what surface is, when to use it, links into the references. |
skills/surface/references/pattern.md |
The substrate-agnostic pattern. The contract every implementation must preserve. |
skills/surface/references/wire-example.md |
One concrete wire (HTTP + JSON over localhost). Illustrative, not normative. |
skills/surface/references/lifecycle.md |
The mechanism space for autonomous draining (Monitor, polling, fs watch, push webhook). |
skills/surface/references/security.md |
Trust boundary, deployment posture, free-field content as injection vector. CSRF + URL-unguessability notes for non-loopback deployments. |
skills/surface/references/hosted-example.md |
Cloudflare Worker + KV wire walkthrough — sibling to wire-example.md, for the hosted substrate. |
skills/surface/examples/server.go |
Go reference server implementing the wire example. Supports either stdout (SUBMIT lines) or filesystem-drop drain via --drain-mode={stdout,fs}. Read it for orientation, re-implement in whatever fits. |
skills/surface/examples/server_test.go |
Tests for the Go reference. |
skills/surface/examples/server.py |
Python stdlib reference, independently derived from the references (not Go-mirrored). Diverges from the Go sibling on operational details (port 8000, no parent-death watchdog, hard 32 MiB multipart cap) — same wire contract. |
skills/surface/examples/server.mjs, server.test.mjs |
Node stdlib reference (node:http), independently derived from the references; node:test suite (21 cases). |
skills/surface/examples/rust/ |
Rust reference server (zero-dependency std::net), independently derived from the references. Cargo project — cargo run / cargo test. |
skills/surface/examples/inline-reveal.py, inline-reveal.md |
Minimal worked example of SKILL.md §6 Rule 5 ("the surface owns the result"): the /submit response carries the payload and the page swaps it into an inline panel — no chat bounce. |
skills/surface/examples/tic-tac-toe.html, tic-tac-toe.md |
Worked capability demo — a tldraw tic-tac-toe surface (the recipient plays, the agent drains moves and replies on the board), with the .md explaining how it maps onto the pattern. |
The Python, Node, and Rust references were each built without their author reading the other siblings — derived from skills/surface/references/ alone. The operational divergences (different ports, watchdog choices, error-status policies) are the validation: the pattern survives independent re-derivation. A clean-room hosted Cloudflare Worker reference is planned the same way — derived references-only, not ported — and isn't in the tree yet.
Packaging (harness-specific plugin wrappers, not loaded as part of the skill):
| Path | Purpose |
|---|---|
.claude-plugin/plugin.json |
Claude plugin manifest. Makes the bundle installable as a Claude Desktop / Cowork plugin; skills under skills/ are auto-discovered. |
.codex-plugin/plugin.json |
Codex plugin manifest. Sibling to the Claude manifest with the same skills/ pointer and lockstep version. Harness-neutral skill content stays under skills/surface/. |
For humans (not loaded by Claude):
| Path | Purpose |
|---|---|
README.md |
This file. |
LICENSE |
Apache 2.0. |
AGENTS.md |
Conventions for agents and contributors working on surface itself (load-bearing design principles, branch policy, halt conditions). CLAUDE.md is a thin shim that imports it so Claude Code auto-loads it. |
docs/decisions.md |
Running log of substantive design choices and rejected proposals, with reasoning. |
skills/surface/go.mod |
Go module declaration for the reference server. |
skills/surface/SKILL.md+references/— the canonical design: the pattern, the wire example, lifecycle mechanisms, security stance. Start here to understand the shape of the thing.AGENTS.md— the load-bearing principles (trust the agent, pattern is the contract, autonomous draining is foundational). Read before changing anything in the skill bundle. (CLAUDE.mdjust imports this for Claude Code.)docs/decisions.md— why specific design calls were made (and why proposals were rejected). Read before re-opening a settled question.
This skill phones no one home. It emits no telemetry and collects no data. Any outbound network activity happens only through hosted-substrate references the user explicitly opts into (e.g., the Cloudflare Worker reference), and is entirely controlled by the operator.
Licensed under the Apache License, Version 2.0. See LICENSE for details.