A capsule is a sealed, self-contained HTML memory object for work worth preserving.
Spec home: htmlcapsule.org — landing page is itself a valid Capsule.
htmlcapsule is the open specification, reference implementation, and example set for Capsule — a profile of HTML for portable, structured, provenance-bearing snapshots of work that's worth preserving across tools, time, and people.
The hypothesis: the substrate (HTML) has already won as the universal display layer. What's missing is discipline — a contract, manifest, versioning, capabilities, and integrity guarantees — so that an LLM conversation, a database query, a hand-written report, a geospatial map, or a research summary can all preserve themselves into the same envelope and live in the same archive. The strongest claim the format makes is multi-producer interop: LLMs (Claude, ChatGPT, Gemini), deterministic compilers (Python/Node build scripts), and human authors all produce the same envelope shape.
Capsules are not a working format — you still edit in your tools of choice. They are a publish / preserve / share format. PDFs were the previous answer; they're closed and lossy. Capsules give the same role to HTML, which is alive, programmable, and re-loadable by future tools (including future AI).
A note on terminology. Claude's "artifacts," ChatGPT's "Canvas," and similar features are working canvases — editable, iterable, live next to the chat. Capsules are what those become when sealed for preservation, sharing, and archival. Different roles, complementary tools: capsule is the seal step that comes after the canvas step.
Current state: Core spec at v0.3.0, full spec at v0.3.2, reference validator at 26 checks.
- For the format in one page: CAPSULE_CORE.md — twelve rules, pasteable into any LLM prompt.
- For the full specification: spec/CAPSULE_SPEC.md — formal definition, validation rules, security model, response protocol, integrity-hash recipe.
- For working examples:
examples/(JSON inputs for the compiler) andspec/examples/(canonical example capsules). - For domain schemas:
spec/DOMAIN_CAPSULES.md—implementation_notes,design_system,exploration_map. - For the research narrative: RESEARCH.md — what we're investigating, findings F1–F18, open questions, methodology.
- For the glossary:
GLOSSARY.md. - For positioning vs. related work:
PRECEDENTS.md.
A single self-contained .html file that packages:
- A bounded snapshot of data
- A machine-readable manifest with provenance metadata
- Embedded media (images, audio) as
data:URIs — no external loads - An interactive UI for the data (export buttons, navigation, optional in-capsule download)
- Honest version and integrity metadata
- All readable content pre-rendered in the HTML at build time (not produced by runtime JavaScript) — capsules are archives, not apps
Designed to be opened, reviewed, interacted with, and shared without requiring a server, network, or live access to the original source data.
- Not a new file format. It's a profile of HTML — files have the
.htmlextension and open in any browser. - Not a SaaS or product. No accounts, no server.
- Not a knowledge graph or second brain. Capsules are atomic outputs, not collected inputs.
- Not competing with Canvas/Artifacts/MCP. Capsules are the sealed layer downstream of working canvases — complementary, not competing.
compiler/compile.py— produces capsules from JSON + template directoriescompiler/validate.py— reference validator with 26 conformance checkstemplates/decision_board/andtemplates/news_capsule/— two compiler templates demonstrating the compile pathexamples/— sanitized JSON inputs you can compile yourself
Quick check:
# Compile a sample capsule
python3 compiler/compile.py examples/ai_tool_selection.json templates/decision_board -o /tmp/test.html
# Validate it
python3 compiler/validate.py /tmp/test.htmlThe format is designed for three producer kinds, all emitting the same envelope shape:
- LLMs (Claude, ChatGPT, Gemini, Codex) — paste CAPSULE_CORE.md into a prompt and the LLM produces a conforming capsule.
generator.kind: "llm". - Deterministic compilers — Python or Node build scripts that query a source and render to inline SVG/HTML.
generator.kind: "compiler". The integrity-hash recipe inspec/CAPSULE_SPEC.md§9.1.1 is normative and has been independently re-derived from prose alone, producing bit-identical hashes. - Hand-authored — humans writing capsules directly in a text editor.
generator.kind: "human"or"hybrid".
All three pass the reference validator. All three carry the same manifest shape with honest generator.kind declarations.
Two versions move semi-independently:
- Core spec (
CAPSULE_CORE.md): the short, pasteable-into-an-LLM-prompt version. Currently v0.3.0. - Full spec (
spec/CAPSULE_SPEC.md): the implementer-grade version. Currently v0.3.2 (doc-only patches on top of v0.3.0).
Core spec versions are tagged in git: core-v0.1.0 … core-v0.3.0. Retrieve any historical version via git show core-vX.Y.Z:CAPSULE_CORE.md.
Capsules currently answer:
- What is this? (manifest fields:
type,title,description,uuid,capsule_version) - Where does it claim to come from? (manifest fields:
generator,source,synthesis,parents[]) - Is the payload intact? (manifest field:
integrity.content_hashwith a normative canonicalization recipe in §9.1.1) - What actions does it support? (manifest field:
capabilities, with Rule 7: declared = implemented)
What the format deliberately doesn't yet answer: did the claimed author actually publish these exact bytes? That's the trust-model gap. The design sketch for a future Sigstore-shaped transparency log is parked in spec/CAPSULE_SPEC.md Appendix E.6, awaiting empirical pressure to build.
This is a research/spec project. The principles:
- Every spec addition requires empirical pressure. No new schema fields, capabilities, or hash scopes without a real producer or consumer hitting a real problem. See spec Appendix E for parked v0.4+ candidates.
- Trust signals stay honest. Hashes, manifests, and capabilities are useful only if they stay honest and legible. The strongest trust signal isn't "this validates perfectly" — it's "you can see what produced it, what data is inside, what was omitted, and what actions are actually supported."
- The corpus drives the spec. Working capsules drive spec evolution; spec inflation runs the other direction and we resist it.
- Schema-minimal. Each field has to earn its keep against "what consumer does something load-bearing with this?"
Apache License 2.0. See LICENSE. The patent grant matters for a format spec — it protects independent implementers.
| Phase | Name | State |
|---|---|---|
| 1 | Format | Exists (Core v0.3.0, full spec v0.3.2) |
| 2 | Compiler | Half-built — reference compiler + validator + corpus of working capsules |
| 3 | Domain capsules | Partial — domain.implementation_notes, domain.design_system, and domain.exploration_map documented |
| 4 | Network layer | Not built; possibly never. Capsule registry, lineage graph, importers all deferred. Trust-log primitive sketched in Appendix E.6 but not built. |