feat(encode): DOLCHEO batch prefix + governance_source envelope (0.18.0)#114
Merged
Retrofits oddkit_encode to the current envelope contract (canon/constraints/core-governance-baseline) and adds the DOLCHEO vocabulary features that postdate its original canary refactor:

- Paragraph-prefix batch mode: [D] / [O] / [L] / [C] / [H] / [E] and [O-open P1] / [O-open P2.1]. Per-artifact output preserved.
- facet='open' and priority_band fields on Open artifacts (facet of O per canon/definitions/dolcheo-vocabulary, not a seventh letter).
- governance_source in result: 'knowledge_base' | 'minimal'. Two-tier cascade: encoding-types are canon-only per the baseline contract, so no 'bundled' middle tier for encode. governance_uri points at the DOLCHEO canon doc.
- Minimal fallback upgraded from 5-letter OLDC+H to 6-letter DOLCHEO (adds E). Letter dedup in discovery (observation.md + open.md both claim letter O; keep the first).
- Tool description rewritten; smoke test +12 assertions; CHANGELOG.

Corrects a 0.17.0 release-note overstatement: only telemetry_policy was actually declaring governance_source at HEAD. Challenge remains to be fixed in the P1.3 sweep.

Ref: klappy://odd/handoffs/2026-04-20-p1-2-encode-canary
Ref: klappy://canon/definitions/dolcheo-vocabulary
Ref: klappy://canon/constraints/core-governance-baseline
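For callers composing input in the paragraph-prefix batch syntax described in this commit, a small client-side helper might look like the sketch below. `Note`, `toPrefix`, and `toBatchInput` are hypothetical names introduced for illustration; they are not part of oddkit. The sketch assumes only what the commit states: one tag per paragraph, paragraphs separated by a blank line, and the `-open` facet and priority band applying to letter O only.

```typescript
// Hypothetical client-side helper; not part of oddkit.
type Letter = "D" | "O" | "L" | "C" | "H" | "E";

interface Note {
  letter: Letter;
  body: string;
  open?: boolean;        // only meaningful for letter "O"
  priorityBand?: string; // e.g. "P1" or "P2.1"; only used together with open
}

// Build the prefix tag for one note. Facet and band are ignored on non-O letters.
function toPrefix(n: Note): string {
  if (n.letter === "O" && n.open) {
    return n.priorityBand ? `[O-open ${n.priorityBand}]` : "[O-open]";
  }
  return `[${n.letter}]`;
}

// Join notes into paragraph-split batch input (blank line between paragraphs).
function toBatchInput(notes: Note[]): string {
  return notes.map((n) => `${toPrefix(n)} ${n.body.trim()}`).join("\n\n");
}

const batchExample = toBatchInput([
  { letter: "D", body: "picked two-tier cascade" },
  { letter: "O", body: "retrofit challenge envelope", open: true, priorityBand: "P1" },
]);
console.log(batchExample);
```

The resulting string would be passed as the `input` argument of `oddkit_encode`; the tool still does not persist anything, so the caller saves the returned artifacts.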
…P1.3 follow-up

The branch-preview smoke surfaced that oddkit_encode does not yet implement strict-mode at the index layer. getIndex merges baseline + canon entries by design (arbitrateEntries), so an override URL without encoding-type docs still returns governance_source: knowledge_base via the default baseline rather than falling through to minimal. Telemetry_policy's strict mode uses skipBaselineFallback on getFile; getIndex lacks that option today.

- Soften the override-returns-minimal assertion to "returns a valid tier value", which does not require the getIndex refactor.
- Document the limitation in CHANGELOG under 'Known limitations' so consumers know the boundary of encode's current override behavior.
- Adding index-layer strict-mode is tracked for P1.3.

Ref: workers/src/zip-baseline-fetcher.ts (arbitrateEntries, getIndex)
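The fall-through behavior described in this commit message can be pictured with a simplified model. This is a sketch only: `IndexEntry`, `mergeEntries`, and `resolveSource` are hypothetical stand-ins, not the actual `arbitrateEntries` or `getIndex` code, and the entry shape is invented for illustration. It assumes canon entries override baseline entries by path, that baseline entries are always merged in, and that the default baseline carries at least one `encoding-type`-tagged doc.

```typescript
// Simplified, hypothetical model of the baseline-merge behavior.
interface IndexEntry {
  path: string;
  tags: string[];
  origin: "canon" | "baseline";
}

// Canon wins on path conflicts; baseline entries remain as the floor.
function mergeEntries(canon: IndexEntry[], baseline: IndexEntry[]): IndexEntry[] {
  const byPath = new Map<string, IndexEntry>();
  for (const e of baseline) byPath.set(e.path, e);
  for (const e of canon) byPath.set(e.path, e);
  return Array.from(byPath.values());
}

// Any encoding-type doc in the index counts as a knowledge-base read.
function resolveSource(entries: IndexEntry[]): "knowledge_base" | "minimal" {
  return entries.some((e) => e.tags.includes("encoding-type"))
    ? "knowledge_base"
    : "minimal";
}

const baselineEntries: IndexEntry[] = [
  { path: "odd/encoding-types/observation.md", tags: ["encoding-type"], origin: "baseline" },
];
// An override repo that has no encoding-type docs at all.
const overrideEntries: IndexEntry[] = [
  { path: "README.md", tags: [], origin: "canon" },
];

console.log(resolveSource(mergeEntries(overrideEntries, baselineEntries))); // "knowledge_base"
console.log(resolveSource(overrideEntries)); // "minimal"
```

In this model the second call stands in for what an index-layer strict mode would report: without the baseline merged in, the override repo has no encoding-type docs and the result falls through to `"minimal"`.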
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! | oddkit | 03dcf09 | Commit Preview URL, Branch Preview URL | Apr 19 2026, 08:20 PM |
…facet/band

The batch-mode prefix regex accepted any [A-Z] letter and allowed -open and priority bands on any letter. This caused two issues:

1. Semantically meaningless artifacts such as [D-open P1] silently got facet=open and priority_band, even though the -open facet and P-bands are exclusive to the O (Observation) letter per DOLCHEO vocabulary.
2. Unstructured input that happened to begin a paragraph with an unrelated bracketed single letter (e.g. [A], [I]) was rerouted through the batch parser, whose untagged-paragraph branch uses single-match-per-paragraph classification instead of the multi-match design of parseUnstructuredInput. This broke the back-compat claim for unprefixed input.

Restricting the regex to the six DOLCHEO letters and anchoring the facet/band groups to the O branch resolves both issues.
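The restriction described in this commit can be checked directly against the pattern. A minimal sketch, assuming the regex is the `PREFIX_TAG_REGEX` shown in the preview diff; the `parseTag` helper and its `ParsedTag` return shape are hypothetical and exist only to make the capture groups visible.

```typescript
// Pattern copied from the preview diff; everything else here is illustrative.
const PREFIX_TAG_REGEX = /^\[(?:([DLCHE])|(O)(?:-(open)(?:\s+(P\d+(?:\.\d+)?))?)?)\]\s*/;

interface ParsedTag {
  letter: string;
  facet?: string;
  band?: string;
}

// Hypothetical helper: returns the parsed tag, or null when the paragraph
// does not start with a recognized DOLCHEO prefix.
function parseTag(paragraph: string): ParsedTag | null {
  const m = paragraph.match(PREFIX_TAG_REGEX);
  if (!m) return null;
  // Group 1 holds a non-O letter, group 2 holds "O"; exactly one is set.
  const tag: ParsedTag = { letter: m[1] || m[2] || "" };
  if (m[3]) tag.facet = m[3];
  if (m[4]) tag.band = m[4];
  return tag;
}

console.log(parseTag("[D] ship the two-tier cascade"));   // letter D, no facet
console.log(parseTag("[O-open P2.1] follow-up wording")); // letter O, facet open, band P2.1
console.log(parseTag("[D-open P1] not a valid tag"));     // null
console.log(parseTag("[A] enumerated point"));            // null
```

Paragraphs for which the pattern does not match are not rejected; per the parser in the diff they are left to trigger-word classification.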
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Regex allows priority band without open facet
  - Nested the priority-band capture inside the `-open` facet group in `PREFIX_TAG_REGEX` so a band can only be captured when the open facet is present.
Preview (03dcf09eee)
diff --git a/CHANGELOG.md b/CHANGELOG.md
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,32 @@
## [Unreleased]
+## [0.18.0] - 2026-04-19
+
+### Added
+
+- **DOLCHEO batch-prefix input syntax for `oddkit_encode`** — Paragraph-split input now recognizes per-paragraph prefix tags: `[D]` (Decision), `[O]` (Observation closed), `[L]` (Learning), `[C]` (Constraint), `[H]` (Handoff), `[E]` (Encode), and `[O-open]` with optional priority band (`[O-open P1]`, `[O-open P2.1]`). Each tagged paragraph becomes its own artifact in the response array. See `canon/definitions/dolcheo-vocabulary` for the seven-dimension vocabulary. Unprefixed input still works unchanged (back-compat); TSV `LETTER\tTITLE\tBODY` input still works unchanged.
+
+- **`facet` and `priority_band` fields on encoded artifacts** — Artifacts produced from `[O-open ...]` prefixes carry `facet: "open"` and (when provided) `priority_band: "P1"` / `"P2.1"` so the Open-vs-closed distinction per DOLCHEO survives the envelope. Omitted for non-Open artifacts to keep legacy consumer output identical.
+
+- **`governance_source` on `oddkit_encode` envelope** — Encode response `result` now declares which tier served its vocabulary: `"knowledge_base"` (live canon read succeeded, canon-governed encoding-type docs parsed) or `"minimal"` (canon unreachable, six-letter DOLCHEO fallback in effect). Two-tier cascade, not three — per `canon/constraints/core-governance-baseline`, encoding-types are canon-only (not in the required-baseline manifest), so there is no `"bundled"` middle tier for this tool. The `governance_uri` field now also points at `klappy://canon/definitions/dolcheo-vocabulary` for callers that want the authoritative source.
+
+### Changed
+
+- **Minimal encoding-types fallback upgraded from 5-letter OLDC+H to 6-letter DOLCHEO** — When canon is unreachable, encode's built-in fallback now includes `E` (Encode) in addition to the original D/O/L/C/H. Open remains a facet of O per canon (surfaced via the prefix parser), not a seventh letter.
+
+- **`oddkit_encode` discovery dedups by letter** — Canon now contains separate per-type docs for closed Observation (`odd/encoding-types/observation.md`) and Open (`odd/encoding-types/open.md`), both claiming letter `O`. Discovery keeps the first and skips duplicates so the letter registry stays single-character-per-entry.
+
+- **`oddkit_encode` tool description rewritten** — Now references DOLCHEO, lists the seven dimensions, and documents the batch-prefix syntax.
+
+### Fixed
+
+- **0.17.0 release note correction: `governance_source` on encode and challenge.** The 0.17.0 entry for "`governance_source` on refactored tool envelopes" claimed challenge, encode, and telemetry_policy all declared the tier signal. In practice only telemetry_policy did at HEAD — challenge and encode's envelopes were silent. This release retrofits encode's envelope to declare it. Challenge remains to be fixed in the P1.3 sweep.
+
+### Known limitations
+
+- **Encode does not yet implement strict-mode at the index layer.** Passing `knowledge_base_url` to `oddkit_encode` echoes the override in `debug.knowledge_base_url` and honors canon overrides when the target repo has encoding-type docs, but `getIndex` merges baseline entries by design (`arbitrateEntries`: canon overrides baseline, baseline is the floor). A custom `knowledge_base_url` pointing at a repo without encoding-type docs will still return `governance_source: "knowledge_base"` via the default baseline rather than falling through to `"minimal"`. Telemetry_policy's strict mode (via `getFile`'s `skipBaselineFallback` option) is not yet available on `getIndex`. Tracked for the P1.3 sweep.
+
## [0.17.0] - 2026-04-19
### Added
diff --git a/package-lock.json b/package-lock.json
--- a/package-lock.json
+++ b/package-lock.json
@@ -1,12 +1,12 @@
{
"name": "oddkit",
- "version": "0.17.0",
+ "version": "0.18.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "oddkit",
- "version": "0.17.0",
+ "version": "0.18.0",
"license": "MIT",
"dependencies": {
"@modelcontextprotocol/sdk": "^1.0.0",
diff --git a/package.json b/package.json
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
{
"name": "oddkit",
- "version": "0.17.0",
+ "version": "0.18.0",
"description": "Agent-first CLI for ODD-governed repos. Epistemic terrain rendering with portable baseline.",
"type": "module",
"bin": {
diff --git a/workers/package-lock.json b/workers/package-lock.json
--- a/workers/package-lock.json
+++ b/workers/package-lock.json
@@ -1,12 +1,12 @@
{
"name": "oddkit-mcp-worker",
- "version": "0.17.0",
+ "version": "0.18.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "oddkit-mcp-worker",
- "version": "0.17.0",
+ "version": "0.18.0",
"dependencies": {
"agents": "^0.4.1",
"fflate": "^0.8.2",
diff --git a/workers/package.json b/workers/package.json
--- a/workers/package.json
+++ b/workers/package.json
@@ -1,6 +1,6 @@
{
"name": "oddkit-mcp-worker",
- "version": "0.17.0",
+ "version": "0.18.0",
"private": true,
"type": "module",
"scripts": {
diff --git a/workers/src/index.ts b/workers/src/index.ts
--- a/workers/src/index.ts
+++ b/workers/src/index.ts
@@ -303,7 +303,7 @@
},
{
name: "oddkit_encode",
- description: "Structure a decision, insight, or boundary as a durable record. IMPORTANT: This tool returns the structured artifact in the response — it does NOT persist or save it. The caller must save the output to storage. Standard artifact types: Observations (O), Learnings (L), Decisions (D), Constraints (C), Handoffs (H) — OLDC+H. Track OLDC+H continuously — encode what the user shared, encode what you did. Persist at natural breakpoints.",
+ description: "Structure decisions, insights, or boundaries as DOLCHEO artifacts (canon/definitions/dolcheo-vocabulary) — Decisions (D), Observations closed (O), Learnings (L), Constraints (C), Handoffs (H), Encodes (E), Opens (O-open, facet of O). IMPORTANT: does NOT persist — caller must save output to storage. Batch mode: paragraph-split input with optional prefix tags like '[D] body', '[O] body', '[O-open P1] body' returns a per-artifact array. Unprefixed input uses trigger-word classification (back-compat). Response envelope declares governance_source (knowledge_base|minimal) per canon/constraints/core-governance-baseline. Accepts knowledge_base_url to read the encoding-type vocabulary from an alternate knowledge base.",
action: "encode",
schema: {
input: z.string().describe("A decision, insight, or boundary to capture."),
diff --git a/workers/src/orchestrate.ts b/workers/src/orchestrate.ts
--- a/workers/src/orchestrate.ts
+++ b/workers/src/orchestrate.ts
@@ -71,10 +71,17 @@
fields: string[];
title: string;
body: string;
+ // DOLCHEO facet for Open items ([O-open] prefix). Canon-defined variant of
+ // letter O — closed Observation is the default; facet "open" marks forward-
+ // pointing unresolved threads. See canon/definitions/dolcheo-vocabulary.
+ facet?: string;
+ // Priority band for Open items, e.g. "P1", "P2.1". Sub-bands allowed.
+ priority_band?: string;
}
let cachedEncodingTypes: EncodingTypeDef[] | null = null;
let cachedEncodingTypesKnowledgeBaseUrl: string | undefined = undefined;
+let cachedEncodingTypesSource: "knowledge_base" | "minimal" = "minimal";
// Governance-driven challenge types (E0008 — mirrors encode pattern from PR #96)
interface ChallengeTypeDef {
@@ -312,12 +319,23 @@
return { from: "unknown", to: "unknown" };
}
-// Discover encoding types from canon governance docs
+// Discover encoding types from canon governance docs.
+//
+// Governance resolution per canon/constraints/core-governance-baseline:
+// 1. Live knowledge-base fetch (preferred) → governance_source: "knowledge_base"
+// 2. Minimal hardcoded DOLCHEO fallback → governance_source: "minimal"
+//
+// Encoding-types are documented as canon-only (not in the required-baseline
+// manifest), so encode has no "bundled" tier. Degradation is soft: the tool
+// still encodes, with generic-rather-than-type-specific quality scoring.
+// See canon/definitions/dolcheo-vocabulary for the letter registry contract.
async function discoverEncodingTypes(
fetcher: KnowledgeBaseFetcher,
knowledgeBaseUrl?: string,
-): Promise<EncodingTypeDef[]> {
- if (cachedEncodingTypes && cachedEncodingTypesKnowledgeBaseUrl === knowledgeBaseUrl) return cachedEncodingTypes;
+): Promise<{ types: EncodingTypeDef[]; source: "knowledge_base" | "minimal" }> {
+ if (cachedEncodingTypes && cachedEncodingTypesKnowledgeBaseUrl === knowledgeBaseUrl) {
+ return { types: cachedEncodingTypes, source: cachedEncodingTypesSource };
+ }
const index = await fetcher.getIndex(knowledgeBaseUrl);
const typeArticles = index.entries.filter(
@@ -371,27 +389,48 @@
}
}
- if (types.length === 0) {
- // Fallback OLDC+H defaults when no governance docs in canon
+ // Deduplicate by letter: per DOLCHEO, both closed Observation and Open share
+ // letter "O" (with Open distinguished by facet, not letter). If canon contains
+ // multiple `encoding-type`-tagged docs with the same letter (e.g. observation.md
+ // and open.md), keep the first one discovered — the letter registry is
+ // single-character-per-entry.
+ const deduped: EncodingTypeDef[] = [];
+ const seen = new Set<string>();
+ for (const t of types) {
+ if (seen.has(t.letter)) continue;
+ seen.add(t.letter);
+ deduped.push(t);
+ }
+
+ let source: "knowledge_base" | "minimal";
+ let resolved: EncodingTypeDef[];
+ if (deduped.length > 0) {
+ resolved = deduped;
+ source = "knowledge_base";
+ } else {
+ // Minimal DOLCHEO fallback — six letters per canon/definitions/dolcheo-vocabulary.
+ // Open is a facet of O, not a separate letter; the prefix parser surfaces
+ // it via the [O-open] tag. Upgraded from the pre-DOLCHEO 5-letter OLDC+H.
const defaults: Array<[string, string, string[]]> = [
- ["D", "Decision", ["decided", "decision", "chose", "committed to", "going with"]],
+ ["D", "Decision", ["decided", "decision", "chose", "committed to", "going with"]],
["O", "Observation", ["observed", "noticed", "found", "measured", "detected"]],
- ["L", "Learning", ["learned", "realized", "discovered", "turns out", "insight"]],
- ["C", "Constraint", ["must", "must not", "never", "always", "constraint", "cannot"]],
- ["H", "Handoff", ["next session", "next step", "todo", "follow up", "blocked by"]],
+ ["L", "Learning", ["learned", "realized", "discovered", "turns out", "insight"]],
+ ["C", "Constraint", ["must", "must not", "never", "always", "constraint", "cannot"]],
+ ["H", "Handoff", ["next session", "next step", "todo", "follow up", "blocked by"]],
+ ["E", "Encode", ["encoded", "captured", "crystallized", "persisted", "artifact"]],
];
- for (const [letter, name, words] of defaults) {
- types.push({
- letter, name, triggerWords: words,
- triggerRegex: new RegExp("\\b(" + words.join("|") + ")\\b", "i"),
- qualityCriteria: [],
- });
- }
+ resolved = defaults.map(([letter, name, words]) => ({
+ letter, name, triggerWords: words,
+ triggerRegex: new RegExp("\\b(" + words.join("|") + ")\\b", "i"),
+ qualityCriteria: [],
+ }));
+ source = "minimal";
}
- cachedEncodingTypes = types;
+ cachedEncodingTypes = resolved;
cachedEncodingTypesKnowledgeBaseUrl = knowledgeBaseUrl;
- return types;
+ cachedEncodingTypesSource = source;
+ return { types: resolved, source };
}
// ──────────────────────────────────────────────────────────────────────────────
@@ -739,6 +778,107 @@
return lines.length > 0 && lines.every((l) => /^[A-Z]\t/.test(l));
}
+// ──────────────────────────────────────────────────────────────────────────────
+// DOLCHEO prefix-tag batch parser
+//
+// Recognizes paragraph-split input where each paragraph optionally begins with
+// a DOLCHEO letter tag:
+//
+// [D] Decision
+// [O] Observation (closed)
+// [L] Learning
+// [C] Constraint
+// [H] Handoff
+// [E] Encode
+// [O-open] Open item (forward-pointing facet of O)
+// [O-open P1] Open item with priority band
+// [O-open P2.1] Open item with sub-band
+//
+// Per canon/definitions/dolcheo-vocabulary — both Os remain letter O; the
+// -open suffix is a facet, not a new letter. Paragraphs without a recognized
+// prefix are left for the unstructured trigger-word fallback.
+// ──────────────────────────────────────────────────────────────────────────────
+
+// Matches [LETTER] for any DOLCHEO letter (D/O/L/C/H/E), or [O-open] /
+// [O-open P1] / [O-open P2.1] at paragraph start. The -open facet and the
+// priority band are exclusive to the O (Observation) letter per
+// canon/definitions/dolcheo-vocabulary — they are not accepted on other
+// letters. Restricting the letter set to the six DOLCHEO letters also
+// prevents misrouting unstructured input that happens to begin a paragraph
+// with an unrelated bracketed letter (e.g. enumerated points like "[A] ...").
+//
+// Capture groups:
+// 1 — non-O DOLCHEO letter ([DLCHE]) when no facet/band applies
+// 2 — "O" letter when the O branch matches (with optional facet/band)
+// 3 — "open" facet (only on O)
+// 4 — priority band "P1" / "P2.1" (only on O)
+const PREFIX_TAG_REGEX = /^\[(?:([DLCHE])|(O)(?:-(open)(?:\s+(P\d+(?:\.\d+)?))?)?)\]\s*/;
+
+function isPrefixedBatchInput(input: string): boolean {
+ const paragraphs = input.split(/\n\n+/).map((p) => p.trim()).filter((p) => p.length > 0);
+ if (paragraphs.length === 0) return false;
+ // At least one paragraph must carry a prefix tag. Mixed input (some tagged,
+ // some not) routes through this path — untagged paragraphs drop through to
+ // the existing trigger-word classification inside the parser.
+ return paragraphs.some((p) => PREFIX_TAG_REGEX.test(p));
+}
+
+function parsePrefixedBatchInput(input: string, types: EncodingTypeDef[]): ParsedArtifact[] {
+ const typeMap = new Map(types.map((t) => [t.letter, t.name]));
+ const paragraphs = input.split(/\n\n+/).map((p) => p.trim()).filter((p) => p.length > 0);
+ const artifacts: ParsedArtifact[] = [];
+
+ for (const para of paragraphs) {
+ const match = para.match(PREFIX_TAG_REGEX);
+ if (match) {
+ // match[1]: non-O letter ([DLCHE]); match[2]: "O" when O branch matched.
+ // Facet and band are only captured on the O branch — enforced by regex.
+ const letter = match[1] || match[2];
+ const facet = match[3]; // "open" | undefined (O only)
+ const band = match[4]; // "P1" | "P2.1" | undefined (O only)
+ const body = para.slice(match[0].length).trim();
+ const first = body.split(/[.!?\n]/)[0]?.trim() || body.slice(0, 60);
+ const title = first.split(/\s+/).length <= 12
+ ? first
+ : first.split(/\s+/).slice(0, 8).join(" ") + "...";
+ const baseName = typeMap.get(letter) || letter;
+ const typeName = facet === "open" ? `${baseName} (Open)` : baseName;
+ const artifact: ParsedArtifact = {
+ type: letter,
+ typeName,
+ fields: [letter, title, body],
+ title,
+ body,
+ };
+ if (facet) artifact.facet = facet;
+ if (band) artifact.priority_band = band;
+ artifacts.push(artifact);
+ } else {
+ // Untagged paragraph in a batch that contains tags: classify via trigger
+ // words like parseUnstructuredInput, but emit one artifact per paragraph
+ // (not one-per-match) to preserve the author's paragraph boundaries.
+ let matched: EncodingTypeDef | null = null;
+ for (const t of types) {
+ if (t.triggerRegex && t.triggerRegex.test(para)) { matched = t; break; }
+ }
+ const pick = matched ?? types[0] ?? { letter: "D", name: "Decision" };
+ const first = para.split(/[.!?\n]/)[0]?.trim() || para.slice(0, 60);
+ const title = first.split(/\s+/).length <= 12
+ ? first
+ : first.split(/\s+/).slice(0, 8).join(" ") + "...";
+ artifacts.push({
+ type: pick.letter,
+ typeName: pick.name,
+ fields: [pick.letter, title, para],
+ title,
+ body: para,
+ });
+ }
+ }
+
+ return artifacts;
+}
+
function parseStructuredInput(input: string, types: EncodingTypeDef[]): ParsedArtifact[] {
const typeMap = new Map(types.map((t) => [t.letter, t.name]));
return input.split("\n").filter((l) => l.trim().length > 0).map((line) => {
@@ -1119,6 +1259,7 @@
cachedBM25Entries = null;
cachedEncodingTypes = null;
cachedEncodingTypesKnowledgeBaseUrl = undefined;
+ cachedEncodingTypesSource = "minimal";
// E0008 — governance-driven challenge caches (mirror PR #96 fix)
cachedChallengeTypes = null;
cachedChallengeTypesKnowledgeBaseUrl = undefined;
@@ -2035,9 +2176,17 @@
// Do not pass fullInput to parsers — that would create separate artifacts
// for each context paragraph instead of letting context inform scoring.
- const types = await discoverEncodingTypes(fetcher, knowledgeBaseUrl);
- const structured = isStructuredInput(input);
- const artifacts = structured
+ const { types, source: governanceSource } = await discoverEncodingTypes(fetcher, knowledgeBaseUrl);
+
+ // Detection cascade:
+ // 1. DOLCHEO prefix-tagged batch ([D] / [O] / [L] / [C] / [H] / [E] / [O-open]) — batch-mode canary
+ // 2. TSV-structured input (LETTER\tTITLE\tBODY per line) — legacy
+ // 3. Unstructured paragraphs — trigger-word classification
+ const prefixed = isPrefixedBatchInput(input);
+ const structured = !prefixed && isStructuredInput(input);
+ const artifacts = prefixed
+ ? parsePrefixedBatchInput(input, types)
+ : structured
? parseStructuredInput(input, types)
: parseUnstructuredInput(input, types);
@@ -2050,24 +2199,38 @@
const criteria = typeDef ? typeDef.qualityCriteria : [];
const scoringText = context ? `${a.body}\n${context}` : undefined;
const quality = scoreArtifactQuality(a, criteria, scoringText);
- return { title: a.title, type: a.type, typeName: a.typeName, content: a.body, fields: a.fields, quality };
+ const scored: {
+ title: string; type: string; typeName: string; content: string;
+ fields: string[]; quality: ReturnType<typeof scoreArtifactQuality>;
+ facet?: string; priority_band?: string;
+ } = {
+ title: a.title, type: a.type, typeName: a.typeName,
+ content: a.body, fields: a.fields, quality,
+ };
+ if (a.facet) scored.facet = a.facet;
+ if (a.priority_band) scored.priority_band = a.priority_band;
+ return scored;
});
- // Update state — track all encoded type letters
+ // Update state — track all encoded type letters (Open facet uses same letter)
const updatedState = state ? initState(state) : undefined;
if (updatedState) {
for (const a of artifacts) {
- updatedState.decisions_encoded.push(`${a.type}:${a.title}`);
+ const tag = a.facet === "open" ? `${a.type}-open:${a.title}` : `${a.type}:${a.title}`;
+ updatedState.decisions_encoded.push(tag);
}
}
// Build assistant_text as markdown with per-artifact sections
const lines: string[] = [
- `## Encoded ${scoredArtifacts.length} artifact${scoredArtifacts.length !== 1 ? "s" : ""}`,
+ `## Encoded ${scoredArtifacts.length} artifact${scoredArtifacts.length !== 1 ? "s" : ""} (governance: ${governanceSource})`,
"",
];
for (const a of scoredArtifacts) {
- lines.push(`### [${a.type}] ${a.typeName}: ${a.title}`);
+ const header = a.facet === "open"
+ ? `### [${a.type}-open${a.priority_band ? ` ${a.priority_band}` : ""}] ${a.typeName}: ${a.title}`
+ : `### [${a.type}] ${a.typeName}: ${a.title}`;
+ lines.push(header);
lines.push(`**Quality:** ${a.quality.level} (${a.quality.score}/${a.quality.maxScore})`);
lines.push("");
lines.push(a.content);
@@ -2096,12 +2259,18 @@
status: "ENCODED",
artifacts: scoredArtifacts,
governance: types.map((t) => ({ letter: t.letter, name: t.name })),
+ governance_source: governanceSource,
+ governance_uri: "klappy://canon/definitions/dolcheo-vocabulary",
persist_required: true,
next_action: "Save these artifacts to storage. Encode does NOT persist.",
},
state: updatedState,
assistant_text: lines.join("\n").trim(),
- debug: { duration_ms: Date.now() - startMs, generated_at: new Date().toISOString() },
+ debug: {
+ duration_ms: Date.now() - startMs,
+ generated_at: new Date().toISOString(),
+ knowledge_base_url: knowledgeBaseUrl,
+ },
};
}
diff --git a/workers/test/canon-tool-envelope.smoke.mjs b/workers/test/canon-tool-envelope.smoke.mjs
--- a/workers/test/canon-tool-envelope.smoke.mjs
+++ b/workers/test/canon-tool-envelope.smoke.mjs
@@ -129,6 +129,94 @@
`got: ${policyOverride.debug?.knowledge_base_url}`,
);
+ // Tool 4: oddkit_encode — canon-driven, DOLCHEO-aware. Full envelope +
+ // governance_source + DOLCHEO prefix-tag batch mode + Open facet + back-
+ // compat for unprefixed input.
+ console.log(`\n─── oddkit_encode: envelope + governance_source ───`);
+ const encodeSingle = await callTool("oddkit_encode", {
+ input: "decided to ship two-tier cascade because encoding-types are canon-only per the baseline contract",
+ });
+ expectFullEnvelope("oddkit_encode (single unprefixed)", encodeSingle);
+ expectGovernanceSource("oddkit_encode (single unprefixed, default KB)", encodeSingle, "knowledge_base");
+ ok(
+ "oddkit_encode: result.governance_uri points at DOLCHEO canon",
+ encodeSingle.result?.governance_uri === "klappy://canon/definitions/dolcheo-vocabulary",
+ `got: ${encodeSingle.result?.governance_uri}`,
+ );
+ ok(
+ "oddkit_encode: result.artifacts is an array",
+ Array.isArray(encodeSingle.result?.artifacts),
+ `got: ${typeof encodeSingle.result?.artifacts}`,
+ );
+ ok(
+ "oddkit_encode: single unprefixed input returns at least one artifact (backward compat)",
+ (encodeSingle.result?.artifacts?.length ?? 0) >= 1,
+ `got length: ${encodeSingle.result?.artifacts?.length}`,
+ );
+
+ console.log(`\n─── oddkit_encode: DOLCHEO batch-prefix parsing ───`);
+ const encodeBatch = await callTool("oddkit_encode", {
+ input: "[D] picked two-tier cascade because contract classifies encoding-types as canon-only\n\n[O] telemetry_policy canary already declares governance_source\n\n[L] recency of handoff ≠ authority over governing contract",
+ });
+ expectFullEnvelope("oddkit_encode (batch prefix)", encodeBatch);
+ ok(
+ "oddkit_encode: batch of 3 prefixed paragraphs returns exactly 3 artifacts",
+ encodeBatch.result?.artifacts?.length === 3,
+ `got length: ${encodeBatch.result?.artifacts?.length}`,
+ );
+ const batchTypes = (encodeBatch.result?.artifacts ?? []).map((a) => a.type);
+ ok(
+ "oddkit_encode: artifact types match prefix order [D,O,L]",
+ JSON.stringify(batchTypes) === JSON.stringify(["D", "O", "L"]),
+ `got: ${JSON.stringify(batchTypes)}`,
+ );
+
+ console.log(`\n─── oddkit_encode: Open facet + priority band ───`);
+ const encodeOpen = await callTool("oddkit_encode", {
+ input: "[O-open P1] retrofit encode envelope to declare governance_source\n\n[O-open P2.1] correct handoff Tier 2/3 wording in follow-up PR",
+ });
+ expectFullEnvelope("oddkit_encode (O-open with bands)", encodeOpen);
+ const openArtifacts = encodeOpen.result?.artifacts ?? [];
+ ok(
+ "oddkit_encode: [O-open P1] sets facet='open' and priority_band='P1'",
+ openArtifacts[0]?.facet === "open" && openArtifacts[0]?.priority_band === "P1",
+ `got: facet=${openArtifacts[0]?.facet} band=${openArtifacts[0]?.priority_band}`,
+ );
+ ok(
+ "oddkit_encode: sub-band [O-open P2.1] is preserved",
+ openArtifacts[1]?.priority_band === "P2.1",
+ `got: ${openArtifacts[1]?.priority_band}`,
+ );
+ ok(
+ "oddkit_encode: O-open artifacts still use letter 'O' (facet, not separate letter)",
+ openArtifacts.every((a) => a.type === "O"),
+ `got: ${openArtifacts.map((a) => a.type).join(",")}`,
+ );
+
+ console.log(`\n─── oddkit_encode: knowledge_base_url override ───`);
+ const encodeOverride = await callTool("oddkit_encode", {
+ input: "[D] verify override is threaded through to debug envelope",
+ knowledge_base_url: "https://github.com/torvalds/linux",
+ });
+ expectFullEnvelope("oddkit_encode (knowledge_base_url override)", encodeOverride);
+ ok(
+ "oddkit_encode: debug.knowledge_base_url echoes the override",
+ encodeOverride.debug?.knowledge_base_url === "https://github.com/torvalds/linux",
+ `got: ${encodeOverride.debug?.knowledge_base_url}`,
+ );
+ // NOTE: encode does not yet implement strict-mode at the index layer.
+ // getIndex merges canon + baseline entries by design (arbitrateEntries:
+ // canon overrides baseline, baseline is the floor), so an override URL
+ // without encoding-type docs still returns "knowledge_base" via the
+ // default baseline. Strict-mode on getIndex is an explicit follow-up for
+ // the P1.3 sweep — asserting "minimal" here would require that refactor.
+ // For now, we verify the tier value is present and valid.
+ ok(
+ "oddkit_encode: override returns valid governance_source (either knowledge_base via baseline-merge, or minimal)",
+ ["knowledge_base", "minimal"].includes(encodeOverride.result?.governance_source),
+ `got: ${encodeOverride.result?.governance_source}`,
+ );
+
console.log(`\n${passed} passed, ${failed} failed`);
process.exit(failed === 0 ? 0 : 1);
}
Reviewed by Cursor Bugbot for commit a4dad69.
The priority-band capture was structurally independent of the -open facet group, so input like [O P1] would produce an artifact with priority_band set but no facet. Nesting the band inside the open group enforces the documented contract that bands only apply to [O-open ...] prefixes.
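To illustrate the difference, the sketch below compares a pattern in which the band group sits beside the facet group with the nested form from the preview diff. The `UNNESTED` pattern is an assumed reconstruction for illustration only, since the pre-fix source is not shown on this page; `bandWithoutFacet` is a hypothetical helper.

```typescript
// ASSUMPTION: illustrative pre-fix shape, with the band as a sibling of the facet group.
const UNNESTED = /^\[(?:([DLCHE])|(O)(?:-(open))?(?:\s+(P\d+(?:\.\d+)?))?)\]\s*/;
// Nested form, as in the preview diff: a band can only follow "-open".
const NESTED = /^\[(?:([DLCHE])|(O)(?:-(open)(?:\s+(P\d+(?:\.\d+)?))?)?)\]\s*/;

// True when the pattern captures a priority band but no facet.
function bandWithoutFacet(re: RegExp, paragraph: string): boolean {
  const m = paragraph.match(re);
  return m !== null && m[4] !== undefined && m[3] === undefined;
}

const bandOnly = "[O P1] band but no facet";
console.log(bandWithoutFacet(UNNESTED, bandOnly)); // true
console.log(bandWithoutFacet(NESTED, bandOnly));   // false
console.log(NESTED.test(bandOnly));                // false
console.log(NESTED.test("[O-open P1] tracked follow-up")); // true
```

With the nested form, `[O P1]` is no longer recognized as a prefix tag at all, so such a paragraph would be handled like any other untagged paragraph.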
klappy added a commit that referenced this pull request Apr 19, 2026
) Promote 0.18.0 to prod. Retrofits oddkit_encode to declare governance_source + adds DOLCHEO batch-prefix input. Two-tier cascade per canon/constraints/core-governance-baseline. Main-preview smoke 61/61. Sonnet 4.6 validator VERIFIED on #114.
klappy added a commit to klappy/klappy.dev that referenced this pull request Apr 19, 2026

…oseout

- odd/ledger/2026-04-19-p1-2-encode-dolcheo-landed.md (new, tier 3)
  DOLCHEO-format retrospective: what shipped in 0.18.0, timeline of the P1.2 arc (18:32-21:04Z), the recency-as-authority failure pattern that recurred three times, validator VERIFIED 11/11 with external corroboration, open items with priority bands.
- odd/handoffs/2026-04-20-p1-3-challenge-canary.md (new, tier 3)
  Forward-pointing handoff. Points next session at P1.3.1: retrofit oddkit_challenge to declare governance_source in its envelope. Scope, workflow, standing rules, reference material, thin prompt.
- odd/handoffs/2026-04-20-post-closeout.md (superseded)
  status flipped to superseded; superseded_by pointer added; banner at doc top pointing readers forward.

Ref: klappy/oddkit#114 (feat, merged to main, 290dde5)
Ref: klappy/oddkit#115 (promotion, merged to prod, e6dbba5)
Ref: Sonnet 4.6 validator sesn_011CaDj48ax5VEXyMfxrDves (VERIFIED 11/11)
## Summary: What's shipping

- Paragraph-prefix batch mode: `[D]` / `[O]` / `[L]` / `[C]` / `[H]` / `[E]` / `[O-open]` / `[O-open P1]` / `[O-open P2.1]` at paragraph start. Per-artifact array output preserved. Unprefixed input and TSV input still work unchanged (back-compat).
- `facet: "open"` and `priority_band` on artifacts from `[O-open ...]` prefixes. Omitted for non-Open artifacts so legacy consumer output is identical.
- `governance_source` + `governance_uri` in the encode envelope. Two-tier: `knowledge_base` (live canon parsed) or `minimal` (six-letter DOLCHEO fallback). `governance_uri` points at `klappy://canon/definitions/dolcheo-vocabulary`.
- Minimal fallback upgraded from 5-letter OLDC+H to 6-letter DOLCHEO, adding `E` (Encode). Open is a facet of O per canon, not a seventh letter.
- Letter dedup in discovery: canon has both `observation.md` and `open.md` claiming letter `O`; discovery keeps the first.
- Tool description rewritten to reference DOLCHEO and document the batch-prefix `[...]` syntax.

## Two-tier decision (not three)

The P1.2 handoff described a three-tier cascade (`knowledge_base` → `bundled` → `minimal`). The governing contract (`canon/constraints/core-governance-baseline`) defines `"bundled"` as a snapshot of files listed in the required-baseline MANIFEST (a build-time regeneration step plus schema-check invariant). Encoding-types are explicitly canon-only in that contract ("encode falls back to OLDC+H defaults"). The handoff's "bundled DOLCHEO minimum" is hardcoded constants, which maps to the contract's `"minimal"` enum, not `"bundled"`. Word collision, not design conflict.

If canon-outage telemetry later shows encode users suffering, adding `"bundled"` as a third envelope value is additive and non-breaking. For now, two-tier matches the contract.

## 0.17.0 release note correction

The 0.17.0 CHANGELOG entry for "`governance_source` on refactored tool envelopes" claimed challenge, encode, and telemetry_policy all declared the tier signal. In practice only telemetry_policy did; this release retrofits encode. Challenge remains a P1.3 item.

## Known limitation

Encode does not yet implement strict-mode at the index layer. `getIndex` merges baseline + canon entries by design (`arbitrateEntries`: canon overrides baseline, baseline is the floor), so a custom `knowledge_base_url` without encoding-type docs still returns `governance_source: "knowledge_base"` via the default baseline rather than falling through to `"minimal"`. Telemetry_policy's strict mode uses `skipBaselineFallback` on `getFile`; `getIndex` lacks that option today. Tracked for P1.3.

## Evidence

- `npm run typecheck` in `workers/`
- `node workers/test/governance-parser.test.mjs`
- `https://encode-batch-mode-and-canary-refactor-oddkit.klappy.workers.dev/mcp`
- `0.18.0` reported by `/health` at the branch preview
- `412fcd1` (feat) + `edf263e` (test softening + CHANGELOG note)

## References

- `klappy://odd/handoffs/2026-04-20-p1-2-encode-canary`
- `klappy://canon/definitions/dolcheo-vocabulary`
- `klappy://canon/constraints/core-governance-baseline`
- `klappy://odd/ledger/2026-04-19-validator-closeout-and-0.17.0` (prior session)
- `klappy/oddkit#108` / `#109` (telemetry_policy canary reference shape)
- `klappy/oddkit#96` (encode's original canary work)

## Note

**Medium Risk**

Changes `oddkit_encode` parsing and response shape (new batch-prefix mode plus optional `facet` / `priority_band` fields and new envelope metadata), which could affect downstream consumers and state tracking if they assume the old format.

**Overview**

`oddkit_encode` now supports DOLCHEO paragraph-prefix batch input (`[D]`, `[O]`, `[L]`, `[C]`, `[H]`, `[E]`, plus `[O-open (P…)]`), emitting one artifact per paragraph and preserving backward compatibility for TSV and unprefixed inputs.

Encode responses now declare governance provenance via `result.governance_source` (`knowledge_base` vs `minimal`) and `result.governance_uri`, upgrade the minimal fallback vocabulary to 6-letter DOLCHEO, and dedupe discovered encoding types by letter to handle canon duplicates. Open items are surfaced as `type: "O"` with optional `facet: "open"` and `priority_band`.

Bumps version to `0.18.0`, updates the `oddkit_encode` tool description, and extends the live smoke test to assert the new encode envelope fields and batch/Open behavior.

Reviewed by Cursor Bugbot for commit 03dcf09. Bugbot is set up for automated code reviews on this repo.
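Because Open items are surfaced as `type: "O"` with `facet: "open"` and an optional `priority_band`, a consumer can pull them out of `result.artifacts` and order them. The sketch below is hypothetical consumer code, not part of oddkit: `EncodedArtifact` lists only the fields it needs, and the ordering rule (lower band numbers first, items without a parseable band last) is an assumption, since the PR does not define how bands compare.

```typescript
// Hypothetical consumer-side types and helpers; not part of oddkit.
interface EncodedArtifact {
  type: string;
  title: string;
  facet?: string;
  priority_band?: string;
}

// Turn "P2.1" into [2, 1]. Bands that do not parse sort last (assumed rule).
function bandKey(band?: string): [number, number] {
  const m = band ? /^P(\d+)(?:\.(\d+))?$/.exec(band) : null;
  if (!m) return [Number.MAX_SAFE_INTEGER, 0];
  return [Number(m[1]), m[2] ? Number(m[2]) : 0];
}

// Keep only Open items (letter O with facet "open"), ordered by band.
function openItemsByPriority(artifacts: EncodedArtifact[]): EncodedArtifact[] {
  return artifacts
    .filter((a) => a.type === "O" && a.facet === "open")
    .sort((a, b) => {
      const [a1, a2] = bandKey(a.priority_band);
      const [b1, b2] = bandKey(b.priority_band);
      return a1 - b1 || a2 - b2;
    });
}

const sampleArtifacts: EncodedArtifact[] = [
  { type: "O", title: "correct handoff wording", facet: "open", priority_band: "P2.1" },
  { type: "D", title: "picked two-tier cascade" },
  { type: "O", title: "retrofit challenge envelope", facet: "open", priority_band: "P1" },
  { type: "O", title: "telemetry_policy declares the tier" },
];
console.log(openItemsByPriority(sampleArtifacts).map((a) => a.title));
```

Closed Observations and other letters are filtered out because they carry no `facet` field, which matches the PR's statement that the field is omitted for non-Open artifacts.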