Merged
127 changes: 127 additions & 0 deletions canon/principles/identity-resolved-by-protocol.md
@@ -0,0 +1,127 @@
---
uri: klappy://canon/principles/identity-resolved-by-protocol
title: "Identity Is Resolved By The Protocol — Hardcoded References Are A Cached Lie"
audience: canon
exposure: nav
tier: 1
voice: principled
stability: graduating
tags: ["principle", "anti-cache-lying", "antifragile", "references", "vodka", "protocol-layer"]
epoch: E0008
date: 2026-04-26
derives_from:
- "odd/constraint/anti-cache-lying.md"
- "canon/methods/supersession.md"
- "canon/principles/ritual-is-a-smell.md"
governs: "All cross-document identity references in canon and consumers of canon"
complements:
- "canon/principles/anti-cache-lying.md"
- "docs/oddkit/specs/oddkit-resolve.md"
---

# Identity Is Resolved By The Protocol — Hardcoded References Are A Cached Lie

> An author writing `[link](/page/some-slug)` is hardcoding a location in source. The location will drift; the source will not. Every consumer that reads this source will encounter a broken reference at some point that depends on which consumer renders it, when, and against which version of the canon. The fix is not better discipline. The fix is moving resolution out of source into the protocol that serves it.

## Summary

Identity references in canon are written as identity, not as location. Resolution to a current location belongs to the protocol that serves the canon — not to the author writing the reference, not to the consumer rendering it. When resolution lives in the protocol, every consumer gets correct, current, supersession-aware references for free. When resolution lives in source (hardcoded URLs, hardcoded paths, hardcoded index tables), every consumer reproduces drift independently.

This is anti-cache-lying applied to references. A hardcoded `/page/some-slug` is a cached projection of "this article currently lives at this URL." A hardcoded `[next article](./relative.md)` is a cached projection of "this file is currently at this relative path." A hardcoded `klappy.dev/writings/foo` link in a chat reply is a cached projection of "this URL is currently the public location." All three are lies the moment the underlying state changes — and the underlying state always changes eventually.

## The Principle

Three load-bearing claims:

1. **References in canon source are identities, not locations.** A `klappy://` URI declares "this is what I'm pointing at" — not "this is where it lives." Locations change; identities don't.
2. **The protocol resolves identity to current location at request time.** Whatever serves the canon (oddkit, in current implementation) takes a URI and returns the canonical current answer, walking supersession chains transparently. Consumers never compute locations from identities themselves.
3. **Consumers receive resolved answers and render them.** They do not maintain private indexes. They do not parse path components. They do not interpret URI structure beyond passing it to the resolver.

When all three hold, references are antifragile to file moves, supersession, renaming, and reorganization. When any one fails, drift returns.
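
A minimal sketch of the second claim, assuming a hypothetical in-memory index keyed by URI. The `INDEX` contents, the `resolve` function, and the result shape are illustrative assumptions rather than oddkit's actual API; the status names `FOUND`, `NOT_FOUND`, and `CIRCULAR_SUPERSESSION` are the ones the audit spec mentions.

```python
# Hypothetical index: URI -> current location plus an optional superseded_by pointer.
INDEX = {
    "klappy://writings/old-essay": {
        "location": "/writings/old-essay",
        "superseded_by": "klappy://writings/new-essay",
    },
    "klappy://writings/new-essay": {
        "location": "/writings/new-essay",
        "superseded_by": None,
    },
    "klappy://writings/loop-a": {
        "location": "/writings/loop-a",
        "superseded_by": "klappy://writings/loop-b",
    },
    "klappy://writings/loop-b": {
        "location": "/writings/loop-b",
        "superseded_by": "klappy://writings/loop-a",
    },
}


def resolve(uri: str, index: dict = INDEX) -> dict:
    """Resolve an identity to its current location, following superseded_by links."""
    seen = set()
    current = uri
    while True:
        if current in seen:
            # The chain loops back on itself, so there is no current answer.
            return {"status": "CIRCULAR_SUPERSESSION", "uri": uri, "location": None}
        seen.add(current)
        entry = index.get(current)
        if entry is None:
            return {"status": "NOT_FOUND", "uri": uri, "location": None}
        if entry["superseded_by"] is None:
            # End of the chain: this is the canonical current document.
            return {"status": "FOUND", "uri": current, "location": entry["location"]}
        current = entry["superseded_by"]
```

A consumer that asks for `klappy://writings/old-essay` receives the location of `new-essay` without knowing a supersession happened, which is the behavior the third claim relies on.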

## Why Hardcoded References Are A Smell

Every hardcoded reference is a bet that the location won't change. The bet is always wrong eventually. The cost compounds:

- **For authors**: every move requires a sweep of every consumer to update references. The sweep is never complete.
- **For consumers**: every reader who follows a stale link loses trust. The loss compounds across the corpus.
- **For canon**: supersession metadata exists but cannot help if references don't go through resolution. Authors write `superseded_by` in frontmatter and consumers still hit the dead URL.

The smell is not "links break." That's the symptom. The smell is **state expressed as authored content** — the same anti-pattern as committing a generated file, the same anti-pattern as caching a derived value in source. Anti-cache-lying names this clearly for derived data; this principle extends it to identity references specifically because identity references are the most common, most invisible, most aggressively-rotting case.

## What This Excludes

The principle is about identity references, not all references. It does NOT govern:

- **External URLs** to systems not under canonical governance (third-party docs, Wikipedia, GitHub repos that aren't ours). Those are genuinely locations, not identities. The HTTP layer handles their resolution.
- **Anchors within a document** (`#section-name`). Internal to a document, no protocol round-trip needed.
- **Code paths** in implementation files (`./utils.ts`). Build systems handle these; they're not canon references.

A reference is governed by this principle when it points to another canon document. Then identity-not-location applies.
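
One way to picture the scope boundary is a small classifier over link targets. This is a sketch under stated assumptions: the category names, the `classify_reference` function, and the treatment of `klappy.dev` as the only first-party host are all hypothetical, chosen to mirror the exclusions listed above.

```python
from urllib.parse import urlparse

# Assumption: klappy.dev is the only first-party host that serves canon.
FIRST_PARTY_HOSTS = {"klappy.dev"}


def classify_reference(target: str) -> str:
    """Classify a link target as identity, anchor, external, hardcoded-location, or other."""
    if target.startswith("klappy://"):
        return "identity"  # governed, and already written as identity
    if target.startswith("#"):
        return "anchor"  # internal to the document, excluded
    parsed = urlparse(target)
    if parsed.scheme in ("http", "https"):
        host = parsed.hostname or ""
        # A first-party URL points at canon by location, so it is governed.
        return "hardcoded-location" if host in FIRST_PARTY_HOSTS else "external"
    if target.startswith("/page/") or target.endswith(".md"):
        return "hardcoded-location"  # canon document referenced by path
    return "other"  # e.g. code paths handled by the build system
```

Only `identity` and `hardcoded-location` fall under the principle; the second is the pattern it asks authors to stop writing.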

## How This Manifests

### In writings

Every cross-reference is a `klappy://` URI. The renderer (Lovable, claude.ai, agents, future consumers) calls the resolver to get the current URL. Authors never write `/page/...` paths or `./relative.md` paths.

### In canon docs

Same as writings. Cross-references between canon, docs, and odd documents use `klappy://` URIs. The `derives_from`, `complements`, `supersedes`, and similar frontmatter fields all use `klappy://` URIs.

### In renderers

Consumers walk content for `klappy://` URIs at render time and call the resolver. Static-build consumers without request-time access call a build-time resolver and ship a manifest. Either way, source never contains a resolved URL.
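
A rough sketch of the request-time path, assuming the resolver is available as a callable that returns a dict with `status` and `location` keys. The `render_links` name, the result shape, and the choice to degrade an unresolved link to plain text are assumptions made for illustration, not documented renderer behavior.

```python
import re

# Markdown links whose target is a klappy:// URI.
LINK_RE = re.compile(r"\[([^\]]+)\]\((klappy://[^)\s]+)\)")


def render_links(markdown: str, resolver) -> str:
    """Replace each klappy:// link target with the location the resolver returns."""

    def replace(match: re.Match) -> str:
        label, uri = match.group(1), match.group(2)
        result = resolver(uri)
        if result.get("status") != "FOUND":
            # Assumption: show the label as plain text rather than emit a dead link.
            return label
        return f"[{label}]({result['location']})"

    return LINK_RE.sub(replace, markdown)


def fake_resolver(uri: str) -> dict:
    """Stand-in resolver for the example; a real consumer would call the protocol."""
    table = {"klappy://canon/methods/supersession": "/canon/methods/supersession"}
    if uri in table:
        return {"status": "FOUND", "location": table[uri]}
    return {"status": "NOT_FOUND", "location": None}


source = "See [Supersession](klappy://canon/methods/supersession) and [gone](klappy://canon/missing)."
rendered = render_links(source, fake_resolver)
```

The source string never contains a resolved URL; only `rendered` does. A static-build consumer could run the same walk once at build time and ship the results as a manifest.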

### In agents

When an agent surfaces a reference in a response, it does so via `klappy://` URI, not by guessing or hardcoding a klappy.dev URL. The presence layer (chat client, document renderer) resolves at display time.

## Disconfirmers

Conditions under which this principle would be narrowed or retracted:

1. **A consumer for whom request-time resolution is unacceptable AND no static-build companion is workable.** Candidates include air-gapped agents, offline exports, and one-shot ingests. If this case becomes load-bearing, the principle weakens to "for first-party request-time-capable consumers."
2. **The resolution layer becomes itself a single point of failure that fails more often than it prevents drift.** If oddkit serves wrong resolutions at higher rate than authors would have introduced drift, the principle is net-negative. (Mitigation: the resolver is well-tested before this principle gets enforced; release-validation-gate exists exactly for this.)
3. **A class of reference emerges that is genuinely a location, not an identity.** Currently every cross-canon reference is plausibly an identity. If a real exception arises, the principle's scope narrows.

The principle survives all three falsifiers as scoping refinements rather than retractions. Real retraction requires the principle producing more drift than it prevents — which would be visible in the dead-reference audit's findings volume over time.

## Relationship to Other Canon

- **Anti-cache-lying** (`klappy://odd/constraint/anti-cache-lying`): this principle is its application to identity references. Anti-cache-lying says "don't store derived state as authored content." This principle says "the location of a reference is derived state; therefore don't author it."
- **Supersession** (`klappy://canon/methods/supersession`): five responses to drift. This principle is what makes the metadata actionable — without resolution-by-protocol, `superseded_by` in frontmatter is a fact nobody acts on.
- **Ritual is a smell** (`klappy://canon/principles/ritual-is-a-smell`): "if correctness depends on remembering a procedure, the system has delegated cognition to the wrong party." Hardcoded references depend on authors remembering to update them on every move. That's ritual. The system should act; the operator reviews.

## Generalizes To

The same architectural answer applies to other surfaces where state is expressed as authored content:

- **README index tables** that list current children of a folder.
- **Frontmatter cross-reference fields** (`complements:`, `related:`, `derives_from:`) that hardcode URIs that should resolve at read time.
- **Glossary entries** that reference defining articles.
- **Navigation menus** that hardcode current canonical paths.

Each of those is a cached projection of state. Each rots independently. The fix is the same — move the projection into the protocol layer. v1 of this principle ships only with link rot fixed via `oddkit_resolve`. The other surfaces become deferred work; when their pain is acute, the principle's prior application gives the architectural answer.

## What This Demands

Of authors: write identity, not location. `klappy://` URIs only.

Of canon governance: ban hardcoded location patterns at lint time (`oddkit_audit` — separate spec) so the principle is mechanically enforced.

Of the protocol (oddkit): provide one canonical resolution surface (`oddkit_resolve` — separate spec). Be partial-data-compliant. Be supersession-aware. Be backward-compatible.

Of consumers: call the resolver. Don't parse URIs. Don't maintain private indexes. Render whatever the resolver returns.

## See Also

- [Anti-Cache Lying](klappy://odd/constraint/anti-cache-lying) — the parent constraint this principle extends
- [Supersession](klappy://canon/methods/supersession) — the metadata this principle activates
- [Ritual Is a Smell](klappy://canon/principles/ritual-is-a-smell) — why discipline alone isn't enough
- [oddkit_resolve](klappy://docs/oddkit/specs/oddkit-resolve) — the protocol mechanism that implements this principle
- [oddkit_audit](klappy://docs/oddkit/specs/oddkit-audit) — the enforcement mechanism that prevents regression

## Origin

Graduated on 2026-04-26 from recurring broken-link reports on klappy.dev that traced to hardcoded `/page/...` and relative-path patterns in source markdown. The 2026-04-09 reference integrity audit found 85 broken references; a 2026-04-24 scan of `writings/*.md` found 11 more in articles authored just weeks earlier. Discipline alone had failed multiple times across multiple sessions. This principle names the architectural reason (identity is not location, and location is derived state) so that future surfaces inherit the same answer rather than rediscovering it through their own incidents.
178 changes: 178 additions & 0 deletions docs/oddkit/specs/oddkit-audit.md
@@ -0,0 +1,178 @@
---
uri: klappy://docs/oddkit/specs/oddkit-audit
title: "oddkit_audit — Action Specification (DRAFT v2 — KISS)"
audience: docs
exposure: nav
tier: 2
voice: neutral
stability: draft
tags: ["spec", "oddkit", "audit", "dead-references", "ci-gate", "vodka", "kiss"]
epoch: E0008
date: 2026-04-26
derives_from:
- "canon/methods/reference-integrity-audit.md"
- "canon/principles/partial-data-with-transparency-and-background-warm.md"
- "canon/principles/ritual-is-a-smell.md"
- "docs/oddkit/specs/oddkit-resolve.md"
governs: "Mechanical detection of dead klappy:// references at PR time"
supersedes: "DRAFT v1 (2026-04-26, four-check version)"
---

# oddkit_audit — Action Specification (DRAFT v2 — KISS)

> Walk every `klappy://` URI in canon. Call `oddkit_resolve` on each. Report the ones that don't resolve. That's the entire job.

## Conviction Shape

- **High conviction**: dead-reference detection is the load-bearing check; the audit reports, the workflow decides whether to fail; partial-data compliance is mandatory.
- **Working belief**: severity classification (error vs warning); allowlist via line-level directives only; legacy-link-pattern detection bundled in (different rule, same surface).
- **Tunable**: severity defaults; PR comment aggregation; how aggressively to flag `legacy-link-pattern` in unchanged files.

## What This Does

Input: a scope (paths + optional `since_commit`).
Output: structured findings for `klappy://` URIs that don't resolve, plus markdown link patterns we know are bad (`/page/...`, `./relative.md` in writings).

Nothing else. No terminological-drift check. No projection-staleness check. No epoch-gap check. No deprecated-terms registry. No epoch-completeness rules. No `audit_allow:` frontmatter field. Each of those was building for a problem we haven't been asked to solve. They live in the deferred-concerns ledger with explicit revisit conditions.

## Why This Is Enough

The reported pain is broken links. The mechanism that prevents broken links is calling the resolver on every URI before merge. That's one check. Bundling four checks into one action conflated four problems; cutting back to one check makes each future addition a separate decision triggered by its own pain.

When terminological drift, projection staleness, or epoch gaps cause concrete pain, each becomes its own thin action — not bolted into this one.

## The Action

### Input

```json
{
  "action": "audit",
  "input": {
    "scope": {
      "paths": ["writings/", "canon/", "odd/", "docs/"],
      "since_commit": "main~1"
    }
  }
}
```

- `scope.paths` — repo-relative path prefixes. Default: full repo excluding `docs/archive/`.
- `scope.since_commit` — limit findings to files changed since this ref. Default: full audit. PR-mode CI sets this to merge-base.

No `checks` field. There's one check; it always runs. No `severity_floor`. Workflow decides what to fail on.
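
A small sketch of how a caller might assemble this envelope. The `build_audit_request` helper is hypothetical, and it assumes that omitting `paths` or `since_commit` is how a caller asks for the documented defaults; the spec does not say how defaults are expressed on the wire.

```python
import json


def build_audit_request(paths=None, since_commit=None) -> dict:
    """Build the audit input envelope, leaving out fields the caller did not set."""
    scope = {}
    if paths is not None:
        scope["paths"] = list(paths)
    if since_commit is not None:
        # PR-mode CI would pass the merge-base ref here.
        scope["since_commit"] = since_commit
    return {"action": "audit", "input": {"scope": scope}}


pr_request = build_audit_request(["writings/", "canon/"], since_commit="main~1")
full_request = build_audit_request()
payload = json.dumps(pr_request)
```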

### Output

`status` is one of `OK`, `FINDINGS`, or `PARTIAL_INDEX`. The example shows a `FINDINGS` response with the `findings` array abbreviated to two of the twelve entries counted in `summary`.

```json
{
  "action": "audit",
  "result": {
    "status": "FINDINGS",
    "summary": {
      "total_findings": 12,
      "by_severity": { "error": 11, "warning": 1 }
    },
    "findings": [
      {
        "rule_id": "dead-reference",
        "severity": "error",
        "location": { "path": "writings/from-passive-to-proactive.md", "line": 47 },
        "occurrence": "klappy://writings/some-broken-slug",
        "message": "URI does not resolve"
      },
      {
        "rule_id": "legacy-link-pattern",
        "severity": "error",
        "location": { "path": "writings/from-passive-to-proactive.md", "line": 53 },
        "occurrence": "/page/writings/some-slug",
        "message": "Use a klappy:// URI instead of /page/ path"
      }
    ],
    "index_state": {
      "warm_count": 552,
      "warming_count": 0
    }
  },
  "server_time": "2026-04-26T02:50:00.000Z"
}
```

### Two rule_ids

`dead-reference` (severity: `error`): a `klappy://` URI for which `oddkit_resolve` returns `NOT_FOUND` or `CIRCULAR_SUPERSESSION`.

`legacy-link-pattern` (severity: `error`): `[label](/page/...)` or `[label](./relative.md)` in `writings/`. These are the patterns that caused the original reader complaints; banning them at the lint level forces use of `klappy://` URIs, which the resolver protects.

That's the entire rule set. Other concerns are deferred.

### Algorithm

For every markdown file in scope:

1. Extract every `[label](target)` markdown link.
2. For targets starting with `klappy://`: call `oddkit_resolve`. On `NOT_FOUND` → `dead-reference` finding. On `CIRCULAR_SUPERSESSION` → also `dead-reference` finding (the URI is functionally dead from the consumer's perspective).
3. For targets matching `/page/...` or `./*.md` in `writings/*.md`: emit `legacy-link-pattern` finding.

Other targets (external URLs, anchors, valid relative paths outside writings): ignore. Not this action's job.
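
The three steps above can be sketched for a single file as follows. This is an illustration, not the implementation: `audit_file` and `make_finding` are hypothetical names, the `resolve` argument is assumed to be a callable that returns a status string, and the message for relative paths is invented for the example. The finding fields follow the output example earlier in this spec.

```python
import re

# Captures the target of every [label](target) markdown link.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
DEAD_STATUSES = {"NOT_FOUND", "CIRCULAR_SUPERSESSION"}


def make_finding(rule_id: str, path: str, line: int, occurrence: str, message: str) -> dict:
    return {
        "rule_id": rule_id,
        "severity": "error",
        "location": {"path": path, "line": line},
        "occurrence": occurrence,
        "message": message,
    }


def audit_file(path: str, text: str, resolve) -> list:
    """Return dead-reference and legacy-link-pattern findings for one markdown file."""
    findings = []
    in_writings = path.startswith("writings/")
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in LINK_RE.finditer(line):
            target = match.group(1)
            if target.startswith("klappy://"):
                if resolve(target) in DEAD_STATUSES:
                    findings.append(
                        make_finding("dead-reference", path, line_no, target, "URI does not resolve")
                    )
            elif in_writings and (
                target.startswith("/page/")
                or (target.startswith("./") and target.endswith(".md"))
            ):
                findings.append(
                    make_finding(
                        "legacy-link-pattern", path, line_no, target,
                        "Use a klappy:// URI instead of a hardcoded path",
                    )
                )
            # Anything else (external URLs, anchors) is ignored.
    return findings
```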

### Allowlist

One mechanism, line-level only:

```markdown
<!-- audit-allow: dead-reference reason="placeholder for upcoming article" -->
[some link](klappy://writings/not-yet-published)
```

Scoped to the next markdown link. One rule_id per directive. Suppressed findings appear in the audit envelope under `suppressed_findings` (count only, not in `summary` totals) so reviewers can see what was suppressed and challenge the reason if needed.

No frontmatter `audit_allow:` field. Adding one is bloat for a problem we haven't observed.
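
A sketch of how the "scoped to the next markdown link" rule might be parsed. The `suppressed_links` function and its return shape are hypothetical, and the directive grammar is inferred from the single example above, so treat the regular expression as an assumption rather than the spec's definition.

```python
import re

DIRECTIVE_RE = re.compile(r'<!--\s*audit-allow:\s*([a-z-]+)\s+reason="([^"]*)"\s*-->')
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def suppressed_links(text: str) -> list:
    """Pair each audit-allow directive with the next markdown link that follows it."""
    events = [(m.start(), "directive", m) for m in DIRECTIVE_RE.finditer(text)]
    events += [(m.start(), "link", m) for m in LINK_RE.finditer(text)]
    events.sort(key=lambda event: event[0])  # walk the document in order

    pending = None  # (rule_id, reason) waiting for its link
    suppressed = []
    for pos, kind, match in events:
        if kind == "directive":
            pending = (match.group(1), match.group(2))
        elif pending is not None:
            rule_id, reason = pending
            suppressed.append({
                "rule_id": rule_id,
                "reason": reason,
                "target": match.group(1),
                "line": text.count("\n", 0, pos) + 1,
            })
            pending = None  # a directive covers exactly one link
    return suppressed
```

An auditor built this way would drop a finding only when its rule, target, and line match one of the returned entries, and would report `len(suppressed)` as the `suppressed_findings` count.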

## Partial-Data Compliance

Per the partial-data principle:

1. User-blocking path bounded by cache lookups.
2. Background warm via `ctx.waitUntil`.
3. Concrete disclosure via `index_state`.

When `status: PARTIAL_INDEX`, findings are best-effort. CI workflow handles this by treating partial-index runs as non-blocking (warning, retry on next push).
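
Since the audit only reports and the workflow decides, the decision can be pictured as a small function over the response. This is a sketch: `ci_outcome`, its `pass`/`warn`/`fail` return values, and the rule that only `error` findings can block are assumptions; the spec states only that partial-index runs are non-blocking and that the gate starts in soft-block mode.

```python
def ci_outcome(result: dict, soft_block: bool = True) -> str:
    """Map an audit result to a workflow outcome: 'pass', 'warn', or 'fail'."""
    status = result["status"]
    if status == "PARTIAL_INDEX":
        # Findings are best-effort while the index warms, so never block on them.
        return "warn"
    if status == "OK":
        return "pass"
    errors = result.get("summary", {}).get("by_severity", {}).get("error", 0)
    if errors == 0:
        return "warn"  # findings exist, but none at error severity
    return "warn" if soft_block else "fail"
```

Flipping `soft_block` to `False` after the observation cycle would be the only change needed to escalate to a hard block.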

## Disconfirmers — What Would Falsify This

1. **The resolver has bugs that produce false `NOT_FOUND` responses.** Audit findings would be false positives. Mitigation: workflow respects `index_state.warming_count`; release-validation-gate on the resolver catches this before audit ships.
2. **Findings volume on first run is so high authors disable the gate.** Mitigation: workflow ships in soft-block mode; one observation cycle to assess before hard-block.
3. **The line-level allowlist proves insufficient (e.g., a template file legitimately has 30 placeholder URIs).** Triggers reconsideration of file-level allowlist (the deferred frontmatter field).

## What This Costs Us If We Don't Ship

The resolver alone fixes the consumer side. Without the audit, authoring discipline is the load-bearing layer for keeping URIs correct — and that's exactly what failed. The principle the resolver embodies stays unenforced at the source.

## Backward Compatibility

Net-new action. No existing callers.

## Migration

1. Land this spec as committed canon.
2. Implement `oddkit_audit` per the algorithm above. Promotion gated on independent Sonnet 4.6 validator pass per E0008.3 / `klappy://canon/constraints/release-validation-gate`. Validator verifies: real `NOT_FOUND` → error finding; real `FOUND` → no finding; legacy pattern in writings → error finding; line-level allowlist suppresses correctly; partial-index emits the right status.
3. Wire into `.github/workflows/canon-quality.yml` (separate artifact). Soft-block this cycle, escalate after observation.

## Open Questions (Tune During Build)

1. PR comment aggregation when findings volume is high. Recommendation: group by file with `<details>` collapse, cap at 50 findings rendered, link to full audit-response.json artifact.
2. Pre-commit hook performance — calling the live worker on every commit is too slow. Recommendation: defer pre-commit to a follow-up; CI-only enforcement is sufficient for v1.
3. Severity for `legacy-link-pattern` in unchanged files. Recommendation: only emit on files modified in the PR; ignore for unchanged files (avoid churning the past).
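
The first recommendation above (group by file, collapse with `<details>`, cap at 50) could look roughly like this. The `render_pr_comment` function, the heading text, and the wording of the overflow line are hypothetical; only the grouping, the collapse element, and the cap come from the recommendation.

```python
from collections import defaultdict


def render_pr_comment(findings: list, cap: int = 50) -> str:
    """Render findings as a PR comment grouped by file, showing at most `cap` entries."""
    shown = findings[:cap]
    by_file = defaultdict(list)
    for finding in shown:
        by_file[finding["location"]["path"]].append(finding)

    lines = [f"### oddkit_audit: {len(findings)} finding(s)"]
    for path in sorted(by_file):
        items = by_file[path]
        lines.append(f"<details><summary>{path} ({len(items)})</summary>")
        lines.append("")
        for item in items:
            lines.append(
                f"- line {item['location']['line']}: `{item['rule_id']}` {item['occurrence']}"
            )
        lines.append("")
        lines.append("</details>")

    hidden = len(findings) - len(shown)
    if hidden > 0:
        lines.append(f"{hidden} more finding(s) omitted; see the full audit response artifact.")
    return "\n".join(lines)
```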

## See Also

- [oddkit_resolve](klappy://docs/oddkit/specs/oddkit-resolve) — the resolver this audit calls
- [Reference Integrity Audit](klappy://canon/methods/reference-integrity-audit) — the stopgap method this audit retires
- [Ritual Is a Smell](klappy://canon/principles/ritual-is-a-smell) — why this exists at all (correctness shouldn't depend on remembering)
- [Partial Data With Transparency And Background Warm](klappy://canon/principles/partial-data-with-transparency-and-background-warm) — partial-data compliance
- [Deferred Concerns Ledger](klappy://docs/planning/link-rot-deferred-concerns) — terminological drift, projection staleness, epoch gaps, and other deferred work

## Origin

Drafted on 2026-04-26 alongside `oddkit_resolve` (DRAFT v4). v1 of this spec proposed four checks (dead-reference + terminological-drift + projection-staleness + epoch-gaps) plus a deprecated-terms registry, epoch-completeness rules, and an `audit_allow:` frontmatter field. v2 (this revision) cut to one check and one allowlist mechanism per the operator's Vodka discipline. The other three checks and supporting registries moved to the deferred-concerns ledger with explicit revisit triggers.