
feat: governance-driven encode architecture#96

Merged
klappy merged 5 commits into main from feat/governance-driven-encode
Apr 16, 2026

Conversation

klappy (Owner) commented Apr 16, 2026

E0008: Replace hardcoded detectEncodeType with governance-driven encoding.

The server searches canon for docs tagged encoding-type, extracts their field schemas, trigger words, and quality criteria, parses both structured (TSV) and unstructured input, scores each artifact per type, and teaches the calling model via the response.

Key changes:

  • loadEncodingTypes() searches canon for encoding-type docs and caches results
  • detectEncodeTypeFromGovernance() replaces hardcoded regex
  • runEncodeAction() produces per-type artifacts instead of single blob
  • Response includes governance definitions to teach calling model
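Conceptually, each governance doc contributes trigger words that compile into a case-insensitive word-boundary regex, and classification keeps every matching type rather than stopping at the first. A minimal standalone sketch (the type shapes here are illustrative, not the actual oddkit interfaces):

```typescript
// Illustrative sketch of governance-driven classification.
interface EncodingType {
  letter: string;
  name: string;
  triggerWords: string[];
}

// Compile trigger words into one word-boundary regex, escaping
// regex metacharacters in each word first.
function compileTrigger(words: string[]): RegExp | null {
  if (words.length === 0) return null;
  const escaped = words.map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  return new RegExp(`\\b(${escaped.join("|")})\\b`, "i");
}

// A paragraph may legitimately match multiple types; all matches are kept.
function classify(paragraph: string, types: EncodingType[]): string[] {
  return types
    .filter((t) => compileTrigger(t.triggerWords)?.test(paragraph))
    .map((t) => t.letter);
}
```

Multi-typing is intentional: a sentence like "we decided we must never log PII" is both a Decision and a Constraint.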

Note

Medium Risk
Changes the encode response shape and classification/scoring behavior, and adds runtime parsing of canon governance docs (regex/table parsing), which could affect downstream callers and edge-case inputs.

Overview
encode is refactored to be governance-driven: the worker now discovers encoding types from canon docs tagged encoding-type (with caching + OLDC+H fallback), and uses those definitions (trigger words + quality criteria) to classify and score artifacts.

runEncodeAction now supports both structured TSV and unstructured text, can emit multiple typed artifacts per input, returns per-artifact quality/gaps/suggestions, includes the active governance type list in the response, and clears the new encoding-type cache during cleanup_storage.
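The structured/unstructured split hinges on a simple shape check: structured input is one TSV row per artifact, each line starting with an uppercase type letter and a tab. A standalone version of that check, with hypothetical example inputs:

```typescript
// Structured input: every non-empty line starts with an uppercase
// type letter followed by a tab (one TSV row per artifact).
function isStructuredInput(input: string): boolean {
  const lines = input.split("\n").filter((l) => l.trim().length > 0);
  return lines.length > 0 && lines.every((l) => /^[A-Z]\t/.test(l));
}

// Hypothetical examples: one Decision row and one Constraint row
// with tab-separated fields, versus free prose.
const structured =
  "D\tUse Workers KV\tChose KV because latency wins\n" +
  "C\tNo PII in logs\tNever log user email";
const unstructured = "We decided to use Workers KV because latency wins.";
```

Anything that fails the check falls through to paragraph-based unstructured parsing.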

Also bumps package version to 0.16.0.

Reviewed by Cursor Bugbot for commit e7a80b4. Bugbot is set up for automated code reviews on this repo.

cloudflare-workers-and-pages Bot commented Apr 16, 2026

Deploying with Cloudflare Workers

✅ Deployment successful — oddkit, commit e7a80b4, Apr 16 2026, 01:49 AM (UTC)

E0008: Replace hardcoded detectEncodeType with governance-driven encoding.
Server discovers encoding-type docs from canon via tag search,
extracts field schemas/trigger words/quality criteria,
parses structured (TSV) and unstructured input,
scores per-type, teaches model via response.
klappy force-pushed the feat/governance-driven-encode branch from eabdfa3 to a1af878 on April 16, 2026 01:21
…encode action

- Add break after first type match in parseUnstructuredInput to prevent
  duplicate artifacts when a paragraph matches multiple encoding types
- Key cachedEncodingTypes by canonUrl so different canon sources get
  separate cached encoding types within the same isolate
- Remove unused detectEncodeTypeFromGovernance function (dead code)
- Fix scoreArtifactQuality: require score >= mx (not mx-1) for strong
  so a 0/1 score no longer rates as strong
- Fix misleading DOLCHE comment to OLDC+H (matches actual 5-type system)
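The corrected score-to-level mapping described above (strong requires a perfect score; the thresholds below it use ceilings of 60% and 40%) can be isolated as a small pure function:

```typescript
// Score-to-level mapping: "strong" requires score >= max (not max - 1),
// so a 0/1 score no longer rates as strong.
function qualityLevel(score: number, max: number): string {
  if (score >= max) return "strong";
  if (score >= Math.ceil(max * 0.6)) return "adequate";
  if (score >= Math.ceil(max * 0.4)) return "weak";
  return "insufficient";
}
```

With the old `score >= max - 1` rule, a 0/1 artifact would have rated "strong"; now it rates "insufficient".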
Claude (oddkit project) and others added 2 commits April 16, 2026 01:39
Reverts Bugbot's break-statement addition in parseUnstructuredInput.
Multiple type matches per paragraph is an intentional design decision:
a paragraph can be both Decision and Constraint simultaneously.
This mirrors what the model would do with separate TSV rows.

Added inline comment to prevent automated regression.
…acts, cache cleanup

- Use first discovered governance type as fallback instead of hardcoded D/Decision
  for unmatched paragraphs in parseUnstructuredInput
- Pass only input (not fullInput) to parseUnstructuredInput so context paragraphs
  are not encoded as standalone artifacts
- Clear cachedEncodingTypes and cachedEncodingTypesCanonUrl in runCleanupStorage
  so governance doc updates take effect without worker isolate recycle

cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.


Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Context parameter silently dropped in encode action
    • Added fullInput merging (context appended to input) in runEncodeAction and passed it to isStructuredInput, parseStructuredInput, and parseUnstructuredInput, matching the pattern already used by runGateAction.
Preview (e7a80b4c14)
diff --git a/package-lock.json b/package-lock.json
--- a/package-lock.json
+++ b/package-lock.json
@@ -1,12 +1,12 @@
 {
   "name": "oddkit",
-  "version": "0.15.0",
+  "version": "0.16.0",
   "lockfileVersion": 3,
   "requires": true,
   "packages": {
     "": {
       "name": "oddkit",
-      "version": "0.15.0",
+      "version": "0.16.0",
       "license": "MIT",
       "dependencies": {
         "@modelcontextprotocol/sdk": "^1.0.0",

diff --git a/workers/src/orchestrate.ts b/workers/src/orchestrate.ts
--- a/workers/src/orchestrate.ts
+++ b/workers/src/orchestrate.ts
@@ -55,6 +55,26 @@
 /** Internal type — handlers return this, handleUnifiedAction stamps server_time */
 type ActionResult = Omit<OddkitEnvelope, "server_time">;
 
+// Governance-driven encoding types
+interface EncodingTypeDef {
+  letter: string;
+  name: string;
+  triggerWords: string[];
+  triggerRegex: RegExp | null;
+  qualityCriteria: Array<{ criterion: string; check: string; gapMessage: string }>;
+}
+
+interface ParsedArtifact {
+  type: string;
+  typeName: string;
+  fields: string[];
+  title: string;
+  body: string;
+}
+
+let cachedEncodingTypes: EncodingTypeDef[] | null = null;
+let cachedEncodingTypesCanonUrl: string | undefined = undefined;
+
 export interface UnifiedParams {
   action: string;
   input: string;
@@ -253,18 +273,166 @@
   return { from: "unknown", to: "unknown" };
 }
 
-function detectEncodeType(input: string): string {
-  if (/\b(decided|decision|chose|choosing|selected|committed to|going with)\b/i.test(input))
-    return "decision";
-  if (/\b(learned|insight|realized|discovered|found that|turns out)\b/i.test(input))
-    return "insight";
-  if (/\b(boundary|limit|constraint|rule|prohibition|must not|never)\b/i.test(input))
-    return "boundary";
-  if (/\b(override|exception|despite|even though|notwithstanding)\b/i.test(input))
-    return "override";
-  return "decision";
+// Discover encoding types from canon governance docs
+async function discoverEncodingTypes(
+  fetcher: ZipBaselineFetcher,
+  canonUrl?: string,
+): Promise<EncodingTypeDef[]> {
+  if (cachedEncodingTypes && cachedEncodingTypesCanonUrl === canonUrl) return cachedEncodingTypes;
+
+  const index = await fetcher.getIndex(canonUrl);
+  const typeArticles = index.entries.filter(
+    (entry: IndexEntry) => entry.tags?.includes("encoding-type") && entry.path.includes("encoding-types/"),
+  );
+
+  const types: EncodingTypeDef[] = [];
+  for (const article of typeArticles) {
+    try {
+      const content = await fetcher.getFile(article.path, canonUrl);
+      if (!content) continue;
+
+      const identityMatch = content.match(/\|\s*Letter\s*\|\s*([A-Z])\s*\|/);
+      const nameMatch = content.match(/\|\s*Name\s*\|\s*([^|]+)\s*\|/);
+      if (!identityMatch) continue;
+
+      const letter = identityMatch[1];
+      const name = nameMatch ? nameMatch[1].trim() : letter;
+
+      const triggerSection = content.match(
+        /## Trigger Words[^\n]*\n[\s\S]*?```\n([\s\S]*?)\n```/,
+      );
+      const triggerWords = triggerSection
+        ? triggerSection[1].split(",").map((w: string) => w.trim()).filter((w: string) => w.length > 0)
+        : [];
+      const triggerRegex =
+        triggerWords.length > 0
+          ? new RegExp("\\b(" + triggerWords.map((w: string) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")).join("|") + ")\\b", "i")
+          : null;
+
+      const criteriaSection = content.match(
+        /## Quality Criteria[\s\S]*?\| Criterion[\s\S]*?\|[-|\s]+\|\n([\s\S]*?)(?=\n\n|\n##|$)/,
+      );
+      const qualityCriteria: Array<{ criterion: string; check: string; gapMessage: string }> = [];
+      if (criteriaSection) {
+        for (const row of criteriaSection[1].split("\n").filter((r: string) => r.includes("|"))) {
+          const cols = row.split("|").map((c: string) => c.trim()).filter((c: string) => c.length > 0);
+          if (cols.length >= 3) {
+            qualityCriteria.push({
+              criterion: cols[0],
+              check: cols[1],
+              gapMessage: cols[2].replace(/^"|"$/g, ""),
+            });
+          }
+        }
+      }
+
+      types.push({ letter, name, triggerWords, triggerRegex, qualityCriteria });
+    } catch {
+      continue;
+    }
+  }
+
+  if (types.length === 0) {
+    // Fallback OLDC+H defaults when no governance docs in canon
+    const defaults: Array<[string, string, string[]]> = [
+      ["D", "Decision", ["decided", "decision", "chose", "committed to", "going with"]],
+      ["O", "Observation", ["observed", "noticed", "found", "measured", "detected"]],
+      ["L", "Learning", ["learned", "realized", "discovered", "turns out", "insight"]],
+      ["C", "Constraint", ["must", "must not", "never", "always", "constraint", "cannot"]],
+      ["H", "Handoff", ["next session", "next step", "todo", "follow up", "blocked by"]],
+    ];
+    for (const [letter, name, words] of defaults) {
+      types.push({
+        letter, name, triggerWords: words,
+        triggerRegex: new RegExp("\\b(" + words.join("|") + ")\\b", "i"),
+        qualityCriteria: [],
+      });
+    }
+  }
+
+  cachedEncodingTypes = types;
+  cachedEncodingTypesCanonUrl = canonUrl;
+  return types;
 }
 
+function isStructuredInput(input: string): boolean {
+  const lines = input.split("\n").filter((l) => l.trim().length > 0);
+  return lines.length > 0 && lines.every((l) => /^[A-Z]\t/.test(l));
+}
+
+function parseStructuredInput(input: string, types: EncodingTypeDef[]): ParsedArtifact[] {
+  const typeMap = new Map(types.map((t) => [t.letter, t.name]));
+  return input.split("\n").filter((l) => l.trim().length > 0).map((line) => {
+    const fields = line.split("\t");
+    const letter = fields[0]?.trim() || "D";
+    return {
+      type: letter, typeName: typeMap.get(letter) || letter,
+      fields, title: fields[1]?.trim() || "", body: fields[2]?.trim() || "",
+    };
+  });
+}
+
+function parseUnstructuredInput(input: string, types: EncodingTypeDef[]): ParsedArtifact[] {
+  const paragraphs = input.split(/\n\n+/).filter((p) => p.trim().length > 0);
+  const artifacts: ParsedArtifact[] = [];
+  for (const para of paragraphs) {
+    let matched = false;
+    for (const t of types) {
+      // DESIGN: no break — a paragraph can match multiple types intentionally.
+      // "We must never deploy without tests" is both Decision and Constraint.
+      // Multi-typing at the server level mirrors what the model would do with
+      // separate TSV rows. Do not add a break here.
+      if (t.triggerRegex && t.triggerRegex.test(para)) {
+        const first = para.split(/[.!?\n]/)[0]?.trim() || para.slice(0, 60);
+        const title = first.split(/\s+/).length <= 12 ? first : first.split(/\s+/).slice(0, 8).join(" ") + "...";
+        artifacts.push({ type: t.letter, typeName: t.name, fields: [t.letter, title, para.trim()], title, body: para.trim() });
+        matched = true;
+      }
+    }
+    if (!matched) {
+      const first = para.split(/[.!?\n]/)[0]?.trim() || para.slice(0, 60);
+      const title = first.split(/\s+/).length <= 12 ? first : first.split(/\s+/).slice(0, 8).join(" ") + "...";
+      const fallback = types[0] || { letter: "D", name: "Decision" };
+      artifacts.push({ type: fallback.letter, typeName: fallback.name, fields: [fallback.letter, title, para.trim()], title, body: para.trim() });
+    }
+  }
+  return artifacts;
+}
+
+function scoreArtifactQuality(
+  artifact: ParsedArtifact,
+  criteria: Array<{ criterion: string; check: string; gapMessage: string }>,
+): { score: number; maxScore: number; level: string; gaps: string[]; suggestions: string[] } {
+  const gaps: string[] = [];
+  const suggestions: string[] = [];
+  let score = 0;
+
+  if (criteria.length === 0) {
+    if (artifact.body.split(/\s+/).length >= 10) score++;
+    else suggestions.push("Expand — more detail improves quality");
+    if (/because|due to|since/i.test(artifact.body)) score++;
+    else suggestions.push("Add rationale");
+    return { score, maxScore: 2, level: score >= 2 ? "adequate" : "weak", gaps, suggestions };
+  }
+
+  for (const c of criteria) {
+    const ck = c.check.toLowerCase();
+    let passed = false;
+    if (ck.includes("non-empty")) passed = artifact.fields.length > 3 || artifact.body.length > 0;
+    else if (ck.includes("10")) passed = artifact.body.split(/\s+/).length >= 10;
+    else if (ck.includes("number") || ck.includes("concrete")) passed = /\d/.test(artifact.body);
+    else if (ck.includes("interpretation") || ck.includes("does not contain")) passed = !/should|better|worse|means|implies/i.test(artifact.body);
+    else if (ck.includes("prohibition") || ck.includes("requirement")) passed = /must|must not|never|always|shall/i.test(artifact.body);
+    else passed = artifact.body.split(/\s+/).length >= 5;
+    if (passed) score++;
+    else { gaps.push(c.gapMessage); suggestions.push(c.gapMessage); }
+  }
+
+  const mx = criteria.length;
+  const level = score >= mx ? "strong" : score >= Math.ceil(mx * 0.6) ? "adequate" : score >= Math.ceil(mx * 0.4) ? "weak" : "insufficient";
+  return { score, maxScore: mx, level, gaps, suggestions };
+}
+
 // ──────────────────────────────────────────────────────────────────────────────
 // Score entries (legacy, kept for backward-compat in existing action handlers)
 // ──────────────────────────────────────────────────────────────────────────────
@@ -563,6 +731,8 @@
   // Also clear the in-memory BM25 index
   cachedBM25Index = null;
   cachedBM25Entries = null;
+  cachedEncodingTypes = null;
+  cachedEncodingTypesCanonUrl = undefined;
 
   return {
     action: "cleanup_storage",
@@ -1246,93 +1416,66 @@
 ): Promise<ActionResult> {
   const startMs = Date.now();
   const fullInput = context ? `${input}\n${context}` : input;
-  const encodeType = detectEncodeType(input);
 
-  const firstSentence = input.split(/[.!?\n]/)[0]?.trim() || input.slice(0, 60);
-  const title =
-    firstSentence.split(/\s+/).length <= 12
-      ? firstSentence
-      : firstSentence.split(/\s+/).slice(0, 8).join(" ") + "...";
+  const types = await discoverEncodingTypes(fetcher, canonUrl);
+  const structured = isStructuredInput(fullInput);
+  const artifacts = structured
+    ? parseStructuredInput(fullInput, types)
+    : parseUnstructuredInput(fullInput, types);
 
-  let rationale: string | null = null;
-  const rMatch =
-    fullInput.match(/because\s+(.+?)(?:\.|$)/i) || fullInput.match(/due to\s+(.+?)(?:\.|$)/i);
-  if (rMatch && rMatch[1].split(/\s+/).length >= 3) rationale = rMatch[1].trim();
+  // Score each artifact using its type's quality criteria
+  const scoredArtifacts = artifacts.map((a) => {
+    const typeDef = types.find((t) => t.letter === a.type);
+    const criteria = typeDef ? typeDef.qualityCriteria : [];
+    const quality = scoreArtifactQuality(a, criteria);
+    return { title: a.title, type: a.type, typeName: a.typeName, content: a.body, fields: a.fields, quality };
+  });
 
-  const constraints: string[] = [];
-  for (const s of fullInput.split(/[.!?\n]+/).filter((s) => s.trim().length > 5)) {
-    if (/\b(must|shall|required|always|never|constraint|cannot)\b/i.test(s))
-      constraints.push(s.trim());
-  }
-
-  let score = 0;
-  if (input.split(/\s+/).length >= 10) score++;
-  if (rationale) score++;
-  if (constraints.length > 0) score++;
-  if (/\b(alternative|instead|option|versus|vs|rather than)\b/i.test(fullInput)) score++;
-  if (/\b(irreversib|reversib|temporary|permanent|until)\b/i.test(fullInput)) score++;
-  const qualityLevel =
-    score >= 4 ? "strong" : score >= 3 ? "adequate" : score >= 2 ? "weak" : "insufficient";
-
-  const gaps: string[] = [];
-  const suggestions: string[] = [];
-  if (!rationale) {
-    gaps.push("No rationale detected — add 'because...'");
-    suggestions.push("Add explicit rationale");
-  }
-  if (constraints.length === 0)
-    suggestions.push("Add constraints: what boundaries does this create?");
-  if (encodeType === "decision" && !/\b(alternative|instead)\b/i.test(fullInput))
-    suggestions.push("Document alternatives considered");
-  if (!/\b(irreversib|reversib|temporary|permanent)\b/i.test(fullInput))
-    suggestions.push("Note whether this is reversible or permanent");
-
-  const artifact = {
-    title,
-    type: encodeType,
-    decision: input.trim(),
-    rationale: rationale || "(not provided — add 'because...' to strengthen)",
-    constraints,
-    status: qualityLevel === "strong" || qualityLevel === "adequate" ? "recorded" : "draft",
-    timestamp: new Date().toISOString(),
-  };
-
-  // Update state
+  // Update state — track all encoded type letters
   const updatedState = state ? initState(state) : undefined;
   if (updatedState) {
-    updatedState.decisions_encoded.push(title);
+    for (const a of artifacts) {
+      updatedState.decisions_encoded.push(`${a.type}:${a.title}`);
+    }
   }
 
-  const lines = [
-    `Encoded ${encodeType}: ${title}`,
-    `Status: ${artifact.status} | Quality: ${qualityLevel} (${score}/5)`,
+  // Build assistant_text as markdown with per-artifact sections
+  const lines: string[] = [
+    `## Encoded ${scoredArtifacts.length} artifact${scoredArtifacts.length !== 1 ? "s" : ""}`,
     "",
   ];
-  lines.push(`Decision: ${input.trim()}`, `Rationale: ${artifact.rationale}`, "");
-  if (constraints.length > 0) {
-    lines.push("Constraints:");
-    for (const c of constraints) lines.push(`  - ${c}`);
+  for (const a of scoredArtifacts) {
+    lines.push(`### [${a.type}] ${a.typeName}: ${a.title}`);
+    lines.push(`**Quality:** ${a.quality.level} (${a.quality.score}/${a.quality.maxScore})`);
     lines.push("");
-  }
-  if (gaps.length > 0) {
-    lines.push("Gaps:");
-    for (const g of gaps) lines.push(`  - ${g}`);
+    lines.push(a.content);
     lines.push("");
+    if (a.quality.gaps.length > 0) {
+      lines.push("**Gaps:**");
+      for (const g of a.quality.gaps) lines.push(`- ${g}`);
+      lines.push("");
+    }
+    if (a.quality.suggestions.length > 0) {
+      lines.push("**Suggestions:**");
+      for (const s of a.quality.suggestions) lines.push(`- ${s}`);
+      lines.push("");
+    }
   }
-  if (suggestions.length > 0) {
-    lines.push("Suggestions:");
-    for (const s of suggestions) lines.push(`  - ${s}`);
-    lines.push("");
+
+  lines.push("---");
+  lines.push("**Encoding types (governance):**");
+  for (const t of types) {
+    lines.push(`- **${t.letter}** — ${t.name}`);
   }
 
   return {
     action: "encode",
     result: {
       status: "ENCODED",
-      artifact,
-      quality: { level: qualityLevel, score, max_score: 5, gaps, suggestions },
+      artifacts: scoredArtifacts,
+      governance: types.map((t) => ({ letter: t.letter, name: t.name })),
       persist_required: true,
-      next_action: "Save this artifact to the project's storage (project journal, file, database). Encode does NOT persist.",
+      next_action: "Save these artifacts to storage. Encode does NOT persist.",
     },
     state: updatedState,
     assistant_text: lines.join("\n").trim(),


Reviewed by Cursor Bugbot for commit 121dd07.

…Action pattern

The context parameter was accepted but never used in runEncodeAction,
silently discarding supplementary context provided by callers. This
restores the fullInput merging pattern used by runGateAction so that
context is included in type detection, artifact parsing, and quality
scoring.
@klappy klappy merged commit bef6c83 into main Apr 16, 2026
5 checks passed
klappy pushed a commit to klappy/klappy.dev that referenced this pull request Apr 16, 2026
Four governance bugs produced four code bugs in PR klappy/oddkit#96.
The server had no governance to follow for these cases and improvised.

Fixes:
1. Removed bare 'found' from observation trigger words — collided with
   Learning's 'found that' during fallback regex classification
2. Added fallback: true to observation.md frontmatter — observation
   is the canonical fallback for unmatched paragraphs
3. Added ## Fallback Behavior section to how-to-write-encoding-types.md
   — specifies how fallback type is resolved via frontmatter
4. Added ## Scoring Algorithm section to how-to-write-encoding-types.md
   — centrally defines score→level mapping (strong=max, adequate=60%+,
   weak=40%+, insufficient<40%)
5. Added ## Context vs Input section to how-to-write-encoding-types.md
   — input generates artifacts, context only informs quality scoring

Prompt over code requires complete governance. These gaps caused the
server to improvise, and bugbot caught the improvisations.
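The fallback resolution described in fixes 2 and 3 can be sketched as frontmatter-driven lookup: prefer the type explicitly marked fallback, otherwise default to the first discovered type (the doc shape here is hypothetical; only the `fallback: true` field comes from the fix list above):

```typescript
// Hypothetical doc shape; `fallback` mirrors the frontmatter flag.
interface TypeDoc {
  letter: string;
  fallback?: boolean;
}

// Prefer the type marked fallback: true; otherwise fall back to the
// first discovered type, matching the earlier behavior.
function resolveFallback(types: TypeDoc[]): TypeDoc | undefined {
  return types.find((t) => t.fallback) ?? types[0];
}
```

This moves the fallback decision out of code and into the governance docs themselves.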
klappy added a commit to klappy/klappy.dev that referenced this pull request Apr 16, 2026
…98)


Co-authored-by: Claude (oddkit project) <chris@klapp.dev>
klappy pushed a commit that referenced this pull request Apr 16, 2026
Server now follows klappy://odd/encoding-types/how-to-write-encoding-types
section 'Context vs Input': input generates artifacts; context only
informs quality scoring.

Before this change, fullInput (input + context) was passed to both
the parser and the scorer, causing context paragraphs to become
separate standalone artifacts. The governance says context is
metadata, not content.

Changes:
- runEncodeAction: parsers receive input only (not fullInput)
- runEncodeAction: scoring receives input + context per artifact so
  background information still counts toward quality
- scoreArtifactQuality: accepts optional scoringText parameter that
  defaults to artifact.body when not provided
- Inline comments cite the governance doc to prevent regression

Closes the gap between governance and code surfaced during PR #96
testing.
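The Context vs Input contract above reduces to one split: only `input` is parsed into artifacts, while `context` is appended only to the text used for quality scoring. A minimal sketch of that separation (function name is illustrative):

```typescript
// Only `input` generates artifacts; `context` informs scoring only.
function splitForEncode(input: string, context?: string) {
  return {
    parseText: input,                                      // fed to the parsers
    scoringText: context ? `${input}\n${context}` : input, // fed to the scorer
  };
}
```

Passing the merged text to the parser was exactly the bug: context paragraphs became standalone artifacts.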
klappy added a commit that referenced this pull request Apr 17, 2026
Mirrors the PR #96 encode pattern. Extracts challenge behavior from
live governance articles (landed in klappy.dev canon via PR #99)
rather than hardcoded source logic.

New functions in workers/src/orchestrate.ts:
- discoverChallengeTypes — per-canonUrl cached type discovery
- fetchBasePrerequisites — universal prerequisite checks
- fetchNormativeVocabulary — RFC 2119 + architectural load-bearing terms
- fetchStakesCalibration — mode-to-depth filter
- extractPrereqTable / extractKeywordsFromCheck — shared helpers

Refactored:
- runChallengeAction — replaces hardcoded detectClaimType /
  generateChallenges / findTensions / findMissingPrerequisites
  with governance extraction. Supports multi-match. Filters output
  by stakes calibration based on mode parameter.
- runCleanupStorage — clears all four new caches on invalidation

Invariant: voice-dump mode suppresses all challenge output
regardless of matched types. Load-bearing per stakes-calibration
governance — some modes exist for raw capture and pressure-testing
at that stage damages the mode.

Graceful degradation: missing governance articles fall back to
minimal built-in behavior with warnings, rather than failing.

Co-authored-by: Claude <noreply@anthropic.com>
klappy added a commit that referenced this pull request Apr 17, 2026
Refactor runChallengeAction in workers/src/orchestrate.ts to extract
challenge-type behavior from canon governance articles at runtime rather
than hardcoding claim-type detection, questions, prerequisites, and
tension rules in source. Structural mirror of PR #96 (encode).

Detection upgraded mid-implementation from regex-OR to BM25 + stemming
after the gauntlet revealed that regex-based matching was morphologically
brittle ("coin" doesn't match trigger "coining"). The pivot removed an
entire class of bug and seeded a reusable pattern for future
governance-driven tools.

Changes in workers/src/orchestrate.ts:
- New: ChallengeTypeDef, BasePrerequisite, NormativeVocabulary,
  StakesModeConfig, StakesCalibration
- New: discoverChallengeTypes (builds per-canonUrl BM25 index over
  detection text), fetchBasePrerequisites, fetchNormativeVocabulary,
  fetchStakesCalibration — each with per-canonUrl cache and graceful
  degradation on missing articles
- New: evaluatePrerequisiteCheck — interprets natural-language check
  strings from prerequisite overlay tables
- Refactored runChallengeAction: multi-match via BM25 score > 0, base
  + overlay prerequisite aggregation, stakes calibration filtering,
  voice-dump suppression invariant, governance-driven tension detection
- Extended runCleanupStorage with five new cache clears (types,
  type-index, base prerequisites, vocabulary, calibration)
- Removed dead detectClaimType (legacy src/tasks/challenge.js retains
  its copy for CLI backward-compat)
- Added CHALLENGE_STOP_WORDS set preserving modal verbs as signal

Changes in workers/src/bm25.ts (backward-compatible extension):
- tokenize(), buildBM25Index() accept optional stopWords: Set<string>
- BM25Index gains optional stopWords field so searchBM25 tokenizes
  queries consistently with the index
- Default behavior unchanged — existing callers unaffected
- Motivation: default STOP_WORDS filters modals (must, should, shall,
  may, not) which are signal for challenge-type detection

New tests: workers/test/governance-parser.test.mjs — 94 assertions
against live governance articles fetched from klappy.dev raw. Covers
type parsing, fallback resolution, BM25 detection, stemming regression
cases (coin/coining, propose/proposed, principle/principles), multi-
match, and the voice-dump suppression invariant. 94/94 pass.

Bugs the gauntlet caught on this PR:
1. Voice-dump suppression invariant would have shipped broken — the
   calibration cell reads "none (suppress all challenge)" not bare
   "none". Strict-equality parser would have produced a single-element
   array, voice-dump mode would have surfaced all challenges in prod.
2. Morphological brittleness in regex detection (coin vs coining) —
   triggered the pivot to BM25 + stemming.
3. Default BM25 STOP_WORDS silently breaks strong-claim and proposal
   detection by filtering modal verbs. Fixed via custom stop word set.
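The real fix uses BM25 with stemming, but the morphology problem in bug 2 can be shown with a deliberately naive suffix-stripper (illustrative only — this is not the stemmer oddkit uses): a literal word-boundary regex for "coin" misses "coining", while stem-level comparison does not.

```typescript
// Deliberately naive stemmer: strip a common suffix for illustration.
function naiveStem(word: string): string {
  return word.toLowerCase().replace(/(ing|ed|es|s)$/, "");
}

// Tokenize, stem each token, compare against the stemmed trigger.
function stemMatch(text: string, trigger: string): boolean {
  const target = naiveStem(trigger);
  return text.split(/\W+/).some((w) => naiveStem(w) === target);
}
```

Even this crude version closes the coin/coining gap that broke regex detection; a real stemmer generalizes it.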

Verification:
- npm run typecheck: clean
- tests/smoke.sh: 6/6 pass (legacy CLI path — backward compat preserved)
- workers/test/governance-parser.test.mjs: 94/94 pass
- AI voice clichés audit on new comments: clean
- oddkit_preflight, challenge, gate, validate: all run; gate NOT_READY
  due to same hardcoded-logic gap as challenge pre-refactor (flagged as
  follow-up)

Response shape change: adds mode, matched_types, type_definitions,
block_until_addressed; removes claim_type. Consumed programmatically,
not rendered.

Follow-ups flagged:
- Encode parity PR — same regex-OR brittleness in runEncodeAction;
  pattern proven here, port will be near-mechanical
- klappy.dev meta governance PR — "compiles into a case-insensitive
  word-boundary regex" is now stale language
- Gate refactor candidate — same hardcoded-logic shape as challenge pre-refactor

Refs:
- Depends on: klappy/klappy.dev#99 (governance articles this code reads)
- Structural mirror: #96 (governance-driven encode)
- Evidence: docs/oddkit/evidence/challenge-governance-code-refactor.md
klappy added a commit that referenced this pull request Apr 19, 2026
Retrofits oddkit_encode to the envelope contract established by the telemetry_policy canary (canon/constraints/core-governance-baseline) and adds the DOLCHEO vocabulary features that postdate encode's original canary refactor (PR #96). Two-tier cascade — encoding-types are canon-only per the baseline contract, not required-baseline. Version bump to 0.18.0 MINOR (additive).

Branch-preview smoke: 61/61 pass. Sonnet 4.6 validator: VERIFIED, 11/11 checks, 3 non-blocking advisories.

Ref: klappy://odd/handoffs/2026-04-20-p1-2-encode-canary
Ref: klappy://canon/definitions/dolcheo-vocabulary
Ref: klappy://canon/constraints/core-governance-baseline