Design the extension entry-point ladder beyond `transform_corpus` #1210

This issue is the home for the design discussion that came out of PR #1196's review.

## Background

PR #1196 introduces the corpus-mutation extension system with a single entry point: a script defines a `transform_corpus(corpus)` global, and the host calls it. This is the simplest possible shape (let's call it **Option 1**) and it's what shipped in that PR.

During review, Alan raised the question of whether this shape is appropriate, particularly as we add:

- More lifecycle hooks (e.g. `before_extract`, `after_render`).
- Extensions that register multiple capabilities from one file (a corpus transform and a Handlebars helper).
- Extensions that bundle non-code files (templates, assets, configuration).
- Extension enabling/disabling.

The PR ships Option 1. This issue captures the trade-off analysis and lays out the ladder of richer options we can add as needs arise.

## The four entry-point patterns

### Option 1: Reserved function names

Script defines a function with a known name (`transform_corpus`); the host introspects.

Examples: pytest, Sphinx.

Trade-off: minimal syntax, but each new hook reserves another global. Doesn't scale past a small fixed set.

My evaluation: right starting point for Mr. Docs. The "reserves a global per hook" weakness only bites once the hook count grows; we have one hook today. pytest and Sphinx are mature systems using this pattern successfully.
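To make the host side of Option 1 concrete, here is a minimal sketch in Python. It is an illustration only, not the PR #1196 implementation: `load_extension`, `RESERVED_HOOKS`, and the dict-shaped corpus are hypothetical, and the scripting language real extensions use may differ.

```python
# Hypothetical host-side sketch of Option 1; not the Mr. Docs implementation.
RESERVED_HOOKS = ("transform_corpus",)  # each new hook reserves another global


def load_extension(source: str) -> dict:
    """Run the script, then introspect its globals for reserved names."""
    namespace: dict = {}
    exec(source, namespace)  # the script's top-level code runs here
    return {
        name: namespace[name]
        for name in RESERVED_HOOKS
        if callable(namespace.get(name))
    }


# An extension script only has to define the reserved global.
SCRIPT = '''
def transform_corpus(corpus):
    corpus["symbols"] = [s for s in corpus["symbols"] if not s.startswith("detail::")]
'''

hooks = load_extension(SCRIPT)
corpus = {"symbols": ["foo", "detail::impl", "bar"]}  # assumed toy corpus shape
if "transform_corpus" in hooks:
    hooks["transform_corpus"](corpus)
print(corpus["symbols"])
```

The scaling concern shows up in `RESERVED_HOOKS`: every additional hook means another name the host must look for and the script author must not use for anything else.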

### Option 2: Top-level registration calls

Script calls `host.register_*(fn)` in top-level code; the host stores the registration and invokes the callback at the right time.

Examples: Darktable, LLVM/Clang plugins.

Trade-off: explicit, scalable, one file can register multiple things; syntactically heavier.

My evaluation: necessary eventually; not yet. Becomes the right move when we want paired helpers (one file registering both a corpus transform and a Handlebars helper) or more lifecycle hooks. IMHO pre-paying for it before there's a concrete need is overkill.
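For comparison, a rough sketch of what Option 2 could look like, again in Python purely for illustration. The `Host` class and the method names `register_corpus_transform` and `register_helper` are assumptions standing in for `host.register_*`; nothing here is an existing Mr. Docs API.

```python
class Host:
    """Hypothetical registry; the register_* method names are assumptions."""

    def __init__(self):
        self.corpus_transforms = []
        self.helpers = {}

    def register_corpus_transform(self, fn):
        self.corpus_transforms.append(fn)

    def register_helper(self, name, fn):
        self.helpers[name] = fn


def load_extension(source, host):
    # The script sees a `host` global and registers itself at top level.
    exec(source, {"host": host})


# One file registers two capabilities: the "paired helpers" case.
SCRIPT = '''
def mark_deprecated(corpus):
    corpus["deprecated"] = sorted(n for n in corpus["symbols"] if n.startswith("old_"))

def shout(text):
    return text.upper()

host.register_corpus_transform(mark_deprecated)
host.register_helper("shout", shout)
'''

host = Host()
load_extension(SCRIPT, host)

corpus = {"symbols": ["old_api", "new_api"]}  # assumed toy corpus shape
for transform in host.corpus_transforms:  # the host decides when callbacks run
    transform(corpus)
print(corpus["deprecated"], host.helpers["shout"]("done"))
```

The script no longer depends on reserved global names; the cost is the extra registration lines and a host object that has to be designed and kept stable.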

### Option 3: Reserved `register` function + event emitter

Scripts export one reserved name (`register`); inside, it subscribes to host events.

Example: Antora.

Trade-off: single reserved name + familiar event pattern; adds an emitter abstraction layer.

My evaluation: probably not the right rung for Mr. Docs. Antora's pattern fits a pipeline with many extension points throughout the build; we have fewer. The emitter abstraction is overhead for our shape. We could skip Option 3 and jump from Option 2 straight to Option 4 if/when needed, which is why the ladder below has no rung for it.
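A small sketch of the Option 3 shape, in Python for illustration only. The `Emitter` class and the `corpus_ready` event name are hypothetical; `after_render` is borrowed from the hook names mentioned in the Background section.

```python
from collections import defaultdict


class Emitter:
    """Hypothetical minimal event emitter owned by the host."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def on(self, event, fn):
        self._listeners[event].append(fn)

    def emit(self, event, payload):
        for fn in self._listeners[event]:
            fn(payload)


def load_extension(source, emitter):
    namespace = {}
    exec(source, namespace)
    register = namespace.get("register")  # the single reserved name
    if callable(register):
        register(emitter)


SCRIPT = '''
def register(events):
    events.on("corpus_ready", lambda corpus: corpus["symbols"].sort())
    events.on("after_render", lambda pages: pages.append("index"))
'''

emitter = Emitter()
load_extension(SCRIPT, emitter)

corpus = {"symbols": ["b", "a"]}  # assumed toy corpus shape
pages = ["a"]
emitter.emit("corpus_ready", corpus)
emitter.emit("after_render", pages)
print(corpus["symbols"], pages)
```

Compared with the registration-call shape, the only reserved name is `register`, but the host now has to define and document an event vocabulary, which is the abstraction overhead referred to above.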

### Option 4: Manifest + accompanying code

An extension is a directory: a manifest file (JSON/YAML) declares the extension name and capabilities; one or more accompanying files contain the actual logic.

Example: Claude Code skills (Markdown frontmatter + body).

Trade-off: most expressive; supports paired helpers, auxiliary files, enable/disable, configurable extensions. Requires the most infrastructure.

My evaluation: the right answer once we want enable/disable, named extensions, auxiliary files, or configurable extensions. Heaviest but most expressive. The natural top of the ladder.
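As a sketch of what the manifest shape might involve, the Python below builds a throwaway extension directory and loads it. Everything about the manifest is an assumption made for illustration: the file name `manifest.json`, the keys `name`, `enabled`, `capabilities`, `kind`, `file`, and `entry`, and the `load_extension_dir` helper are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_extension_dir(root: Path) -> dict:
    """Read the manifest first, then load only what it declares."""
    manifest = json.loads((root / "manifest.json").read_text(encoding="utf-8"))
    capabilities = {}
    if manifest.get("enabled", True):  # enable/disable lives in the manifest
        for cap in manifest["capabilities"]:
            namespace = {}
            exec((root / cap["file"]).read_text(encoding="utf-8"), namespace)
            capabilities[cap["kind"]] = namespace[cap["entry"]]
    return {"name": manifest["name"], "capabilities": capabilities}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "manifest.json").write_text(
        json.dumps({
            "name": "hide-detail",
            "enabled": True,
            "capabilities": [
                {"kind": "corpus_transform", "file": "transform.py", "entry": "run"}
            ],
        }),
        encoding="utf-8",
    )
    (root / "transform.py").write_text(
        "def run(corpus):\n"
        "    corpus['symbols'] = [s for s in corpus['symbols'] if '::detail' not in s]\n",
        encoding="utf-8",
    )
    ext = load_extension_dir(root)

corpus = {"symbols": ["lib::f", "lib::detail::g"]}  # assumed toy corpus shape
ext["capabilities"]["corpus_transform"](corpus)
print(ext["name"], corpus["symbols"])
```

The extension gets a name and an on/off switch without running any of its code, which is what the simpler shapes cannot offer; the price is a manifest schema that has to be specified, validated, and versioned.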

## The ladder

The options aren't mutually exclusive. They form a complexity ladder:

| Rung | Pattern | What you get |
|---|---|---|
| 1 | Reserved name (Option 1) | Simplest case: one capability, one file, no ceremony |
| 2 | Registration calls (Option 2) | Shared data: one file registers multiple capabilities (e.g., a corpus transform alongside a Handlebars helper) |
| 3 | Manifest + code (Option 4) | Shared files: an extension is a directory bundling code, helpers, and assets |
| ... | ... | enable/disable, configuration schemas, ... |

PR #1196 ships rung 1. Higher rungs land as concrete use cases surface.

## Future questions to settle here

These came up in the PR review. They are not blocking PR #1196 but should inform the ladder above.

- **Paired helpers**: should one extension file be able to register both a corpus transform and a Handlebars helper? This forces rung 2+.
- **Auxiliary files**: should an extension be a directory with assets/templates/config, not just a script? This forces rung 3.
- **Enable/disable**: how do users opt individual extensions in or out? Likely needs a config-side knob and probably an extension name (which forces a manifest).
- **Registering generators**: should extensions be able to add new output formats (e.g., a Markdown generator)? Forces rung 3 and a richer registry.
- **Invariant safety**: we all seem to agree that extensions should not break invariants, but some features require breaking them. As real use cases land, this tension will need a concrete resolution (tighter allowlist, opt-in unsafe mutations, post-hoc validation, etc.).
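Of the resolutions listed for invariant safety, post-hoc validation is the easiest to sketch. The Python below is only an illustration under invented assumptions: the corpus is modelled as a dict of symbol IDs, and the single invariant checked (every `parent` must exist) is hypothetical, not a statement about the real corpus.

```python
import copy


def check_invariants(corpus):
    """Hypothetical invariant: every symbol's parent must exist in the corpus."""
    return [
        f"{sid}: missing parent {info['parent']}"
        for sid, info in corpus.items()
        if info["parent"] is not None and info["parent"] not in corpus
    ]


def apply_checked(corpus, transform):
    """Run the transform on a copy and commit only if invariants still hold."""
    candidate = copy.deepcopy(corpus)
    transform(candidate)
    errors = check_invariants(candidate)
    if errors:
        return corpus, errors  # reject: the original corpus is untouched
    return candidate, []


corpus = {"ns": {"parent": None}, "ns::f": {"parent": "ns"}}


def drop_namespace(c):  # breaks the invariant: ns::f loses its parent
    del c["ns"]


def add_function(c):  # preserves the invariant
    c["ns::g"] = {"parent": "ns"}


rejected, errors = apply_checked(corpus, drop_namespace)
accepted, no_errors = apply_checked(corpus, add_function)
print(errors, sorted(accepted))
```

Copying the whole corpus per transform is the simplest way to get rollback but may be too expensive at scale; an allowlist or opt-in unsafe mode would trade that cost for a narrower API.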
