okf: add rationale to the recommended frontmatter fields #168

Closed
virideanil wants to merge 1 commit into GoogleCloudPlatform:main from virideanil:okf-spec-rationale

Conversation

@virideanil virideanil commented Jul 1, 2026

Summary

  • Adds rationale to the §4.1 frontmatter template and the recommended field
    list: a single sentence stating why the concept exists or why its current
    value was chosen — the "why" alongside description's "what".
  • When two documents disagree, a stated rationale lets a consumer reconcile
    them by intent instead of recency alone. Design-rationale capture is a
    long-studied need (IBIS, 1970); the closest existing field is schema.org's
    backstory ("why and how an article was created"), which is scoped to
    Articles — no major standard exposes a general-purpose equivalent.
  • Backward-compatible by construction: already legal under the "Producers MAY
    include any additional keys" rule; this names it as recommended, last in the
    priority order. No parser or schema-version change; okf test suite passes
    with the change applied (33 passed).
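
To make the second bullet concrete, here is a minimal Python sketch of how a consumer might use rationale when two documents disagree. It is one possible policy, not something this PR or SPEC.md prescribes: the value key, the ISO 8601 timestamp format, the sample documents, and the pick_document helper are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical frontmatter of two documents that describe the same concept.
# "description", "timestamp" and "rationale" are spec-level key names;
# "value" and every sample string below are invented for this sketch.
doc_a = {
    "description": "Retention window for raw events.",
    "timestamp": "2026-05-01T00:00:00+00:00",
    "rationale": "Set to 14 months to cover year-over-year comparisons.",
    "value": "14 months",
}
doc_b = {
    "description": "Retention window for raw events.",
    "timestamp": "2026-06-01T00:00:00+00:00",
    "value": "2 months",
}


def parse_timestamp(doc: dict) -> datetime:
    """Parse the (assumed ISO 8601) timestamp of a document."""
    return datetime.fromisoformat(doc["timestamp"])


def pick_document(docs: list) -> dict:
    """Prefer documents that state a rationale; break ties by recency.

    If no document states a rationale, fall back to recency alone.
    """
    with_rationale = [d for d in docs if d.get("rationale")]
    pool = with_rationale or docs
    return max(pool, key=parse_timestamp)


chosen = pick_document([doc_a, doc_b])
print(chosen["value"], "because:", chosen.get("rationale", "(no rationale stated)"))
```

With recency alone, doc_b would win because it is newer. Under this policy doc_a wins because it is the only one that says why its value was chosen. A real consumer might instead surface both documents and their rationales to a reviewer.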
Full reasoning — why this field, why only this field, and what was measured before proposing

Why extend rather than propose more. W3C PROV-O and C2PA already formalize
who/where/how at standards weight, so those dimensions need no new field. "Why" is the
one dimension with no operationalized home in any major standard (checked PROV-O, C2PA,
Dublin Core's fifteen elements, and current production knowledge-base schemas).
Design-rationale capture has been a research field since IBIS (1970) without ever
shipping as a simple metadata field; this proposes the smallest possible version of it,
in the spec's own grammar. The fuller provenance extension (who/where/how blocks, explicit
unknown-value markers, signing, and the measurement code below) is public at
virideanil/claude-code-toolbox
— deliberately NOT part of this PR, because a spec ask should be smaller than an extension.

Why optional and last in priority order. A new field should claim the lowest slot
and change nothing for anyone: it is already legal today under the Extensions rule; this
only names it so producers converge on one key instead of many.

What was measured before proposing. Three paired experiments, the last on this
repo's own live example bundles (crypto_bitcoin / ga4 / stackoverflow — 65 content
documents, unmodified) with a de-confounded 3-engines × 3-formats matrix and paired
mid-p McNemar statistics. Reported with the nulls: structured frontmatter did not
improve retrieval over plain BM25 on unstructured prose (an earlier, better-looking
headline died under proper measurement) — but a structural "why is unknown" marker was
phrasing-invariant where text matching wasn't (1.000 vs 0.500 recall under varied
uncertainty phrasings). The conclusion that motivates exactly this field: declared
metadata earns its keep in reconciliation and in making not-knowing queryable — not in
findability. A stated rationale is the smallest unit of that.
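
For readers unfamiliar with the statistic, the mid-p McNemar test compares two paired conditions using only the discordant pairs (items where exactly one condition succeeded). The sketch below implements the standard two-sided mid-p definition in Python. It is not the measurement code from these experiments, and the function name and the counts in the usage line are hypothetical.

```python
from math import comb


def mcnemar_midp(b: int, c: int) -> float:
    """Two-sided mid-p McNemar test on discordant pair counts.

    b: pairs where only condition A succeeded
    c: pairs where only condition B succeeded
    Under the null hypothesis, min(b, c) ~ Binomial(b + c, 0.5).
    mid-p = 2 * P(X <= k) - P(X = k), with k = min(b, c).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence either way
    k = min(b, c)
    scale = 0.5 ** n
    pmf_k = comb(n, k) * scale
    cdf_k = sum(comb(n, i) for i in range(k + 1)) * scale
    # Guard against floating-point overshoot; mathematically the value is <= 1.
    return min(1.0, 2.0 * cdf_k - pmf_k)


# Hypothetical counts: condition B wins on 6 discordant items, A on none.
print(mcnemar_midp(0, 6))  # 0.015625
```

The mid-p variant subtracts the probability of the observed count itself, which makes it less conservative than the exact binomial version when the number of discordant pairs is small.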

Two small observations from the same work, separate from this PR and offered in
case useful — happy to file as issues: toolbox/mdcode's Documents layout reads
metadata.timeStamp (camelCase) where SPEC.md and the committed bundles write
timestamp, and it never maps resource — so ingesting this repo's own example bundles
through that path silently drops both fields.
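
The failure mode described above can be illustrated with a short Python sketch. It does not reproduce toolbox/mdcode's actual code: both functions and the sample values are hypothetical, and only the key names timeStamp, timestamp, and resource come from the observation itself.

```python
# Hypothetical frontmatter written the way SPEC.md spells the keys.
metadata = {
    "timestamp": "2026-07-01T00:00:00+00:00",  # sample value, invented
    "resource": "example-resource",            # sample value, invented
}


def map_metadata_camel_only(meta: dict) -> dict:
    """Illustrative reader that only knows the camelCase key.

    A lookup for "timeStamp" misses "timestamp" without raising,
    and "resource" is never read at all.
    """
    out = {}
    if "timeStamp" in meta:
        out["timestamp"] = meta["timeStamp"]
    return out


def map_metadata_tolerant(meta: dict) -> dict:
    """Illustrative reader that accepts both spellings and maps resource."""
    out = {}
    ts = meta.get("timestamp", meta.get("timeStamp"))
    if ts is not None:
        out["timestamp"] = ts
    if "resource" in meta:
        out["resource"] = meta["resource"]
    return out


print(map_metadata_camel_only(metadata))  # {}
print(map_metadata_tolerant(metadata))
```

Because dictionary lookups on a missing key return nothing rather than failing, the first reader produces an empty result with no error, which is what makes this kind of mismatch easy to overlook.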

🤖Worked with Claude Code

google-cla Bot commented Jul 1, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Adds an optional rationale key — a single sentence stating why a
concept exists or why its current value was chosen — to the 4.1
frontmatter template and the recommended field list. A stated
rationale lets consumers reconcile disagreeing documents by intent
rather than recency alone. Falls under the existing "producers MAY
include any additional keys" rule, so no parser or version change.
@virideanil virideanil force-pushed the okf-spec-rationale branch from 5ecb301 to edc3b0c Compare July 1, 2026 23:25
virideanil added a commit to virideanil/claude-code-toolbox that referenced this pull request Jul 2, 2026
Three small, independent, honestly-scoped Claude Code / OKF utilities.
okf-provenance (PerceptionRationale) is the main piece: an optional
signed provenance layer for OKF bundles, measured three experiments
deep against real external data, with the null results reported
alongside the wins. FINDINGS.md carries the full bug ledger (ours and
upstream's); RATIONALE.md carries the reasoning chain. Sibling of
GoogleCloudPlatform/knowledge-catalog#168.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@virideanil virideanil closed this Jul 2, 2026
@virideanil virideanil deleted the okf-spec-rationale branch July 2, 2026 09:50