
feat: governance-driven encode architecture#96

Merged
klappy merged 5 commits into main from feat/governance-driven-encode
Apr 16, 2026

Conversation

klappy (Owner) commented Apr 16, 2026

E0008: Replace hardcoded detectEncodeType with governance-driven encoding.

The server searches canon for docs tagged encoding-type, extracts their field schemas, trigger words, and quality criteria, parses both structured (TSV) and unstructured input, scores each artifact per type, and teaches the calling model via the response.

Key changes:

  • loadEncodingTypes() searches canon for encoding-type docs and caches results
  • detectEncodeTypeFromGovernance() replaces hardcoded regex
  • runEncodeAction() produces per-type artifacts instead of single blob
  • Response includes governance definitions to teach calling model
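Conceptually, each governance doc contributes trigger words that compile into a case-insensitive word-boundary regex, and classification keeps every matching type rather than stopping at the first. A minimal standalone sketch (the type shapes here are illustrative, not the actual oddkit interfaces):

```typescript
// Illustrative sketch of governance-driven classification.
interface EncodingType {
  letter: string;
  name: string;
  triggerWords: string[];
}

// Compile trigger words into one word-boundary regex, escaping
// regex metacharacters in each word first.
function compileTrigger(words: string[]): RegExp | null {
  if (words.length === 0) return null;
  const escaped = words.map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  return new RegExp(`\\b(${escaped.join("|")})\\b`, "i");
}

// A paragraph may legitimately match multiple types; all matches are kept.
function classify(paragraph: string, types: EncodingType[]): string[] {
  return types
    .filter((t) => compileTrigger(t.triggerWords)?.test(paragraph))
    .map((t) => t.letter);
}
```

Multi-typing is intentional: a sentence like "we decided we must never log PII" is both a Decision and a Constraint.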

Note

Medium Risk
Changes the encode response shape and classification/scoring behavior, and adds runtime parsing of canon governance docs (regex/table parsing), which could affect downstream callers and edge-case inputs.

Overview
encode is refactored to be governance-driven: the worker now discovers encoding types from canon docs tagged encoding-type (with caching + OLDC+H fallback), and uses those definitions (trigger words + quality criteria) to classify and score artifacts.

runEncodeAction now supports both structured TSV and unstructured text, can emit multiple typed artifacts per input, returns per-artifact quality/gaps/suggestions, includes the active governance type list in the response, and clears the new encoding-type cache during cleanup_storage.
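The structured/unstructured split hinges on a simple shape check: structured input is one TSV row per artifact, each line starting with an uppercase type letter and a tab. A standalone version of that check, with hypothetical example inputs:

```typescript
// Structured input: every non-empty line starts with an uppercase
// type letter followed by a tab (one TSV row per artifact).
function isStructuredInput(input: string): boolean {
  const lines = input.split("\n").filter((l) => l.trim().length > 0);
  return lines.length > 0 && lines.every((l) => /^[A-Z]\t/.test(l));
}

// Hypothetical examples: one Decision row and one Constraint row
// with tab-separated fields, versus free prose.
const structured =
  "D\tUse Workers KV\tChose KV because latency wins\n" +
  "C\tNo PII in logs\tNever log user email";
const unstructured = "We decided to use Workers KV because latency wins.";
```

Anything that fails the check falls through to paragraph-based unstructured parsing.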

Also bumps package version to 0.16.0.

Reviewed by Cursor Bugbot for commit e7a80b4. Bugbot is set up for automated code reviews on this repo.

cloudflare-workers-and-pages Bot commented Apr 16, 2026

Deploying with Cloudflare Workers

✅ Deployment successful — oddkit, commit e7a80b4, Apr 16 2026, 01:49 AM (UTC)

E0008: Replace hardcoded detectEncodeType with governance-driven encoding.
Server discovers encoding-type docs from canon via tag search,
extracts field schemas/trigger words/quality criteria,
parses structured (TSV) and unstructured input,
scores per-type, teaches model via response.
klappy force-pushed the feat/governance-driven-encode branch from eabdfa3 to a1af878 on April 16, 2026 01:21
…encode action

- Add break after first type match in parseUnstructuredInput to prevent
  duplicate artifacts when a paragraph matches multiple encoding types
- Key cachedEncodingTypes by canonUrl so different canon sources get
  separate cached encoding types within the same isolate
- Remove unused detectEncodeTypeFromGovernance function (dead code)
- Fix scoreArtifactQuality: require score >= mx (not mx-1) for strong
  so a 0/1 score no longer rates as strong
- Fix misleading DOLCHE comment to OLDC+H (matches actual 5-type system)
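The corrected score-to-level mapping described above (strong requires a perfect score; the thresholds below it use ceilings of 60% and 40%) can be isolated as a small pure function:

```typescript
// Score-to-level mapping: "strong" requires score >= max (not max - 1),
// so a 0/1 score no longer rates as strong.
function qualityLevel(score: number, max: number): string {
  if (score >= max) return "strong";
  if (score >= Math.ceil(max * 0.6)) return "adequate";
  if (score >= Math.ceil(max * 0.4)) return "weak";
  return "insufficient";
}
```

With the old `score >= max - 1` rule, a 0/1 artifact would have rated "strong"; now it rates "insufficient".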
Claude (oddkit project) and others added 2 commits April 16, 2026 01:39
Reverts Bugbot's break-statement addition in parseUnstructuredInput.
Multiple type matches per paragraph is an intentional design decision:
a paragraph can be both Decision and Constraint simultaneously.
This mirrors what the model would do with separate TSV rows.

Added inline comment to prevent automated regression.
…acts, cache cleanup

- Use first discovered governance type as fallback instead of hardcoded D/Decision
  for unmatched paragraphs in parseUnstructuredInput
- Pass only input (not fullInput) to parseUnstructuredInput so context paragraphs
  are not encoded as standalone artifacts
- Clear cachedEncodingTypes and cachedEncodingTypesCanonUrl in runCleanupStorage
  so governance doc updates take effect without worker isolate recycle

cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.


Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Context parameter silently dropped in encode action
    • Added fullInput merging (context appended to input) in runEncodeAction and passed it to isStructuredInput, parseStructuredInput, and parseUnstructuredInput, matching the pattern already used by runGateAction.
Preview (e7a80b4c14)
diff --git a/package-lock.json b/package-lock.json
--- a/package-lock.json
+++ b/package-lock.json
@@ -1,12 +1,12 @@
 {
   "name": "oddkit",
-  "version": "0.15.0",
+  "version": "0.16.0",
   "lockfileVersion": 3,
   "requires": true,
   "packages": {
     "": {
       "name": "oddkit",
-      "version": "0.15.0",
+      "version": "0.16.0",
       "license": "MIT",
       "dependencies": {
         "@modelcontextprotocol/sdk": "^1.0.0",

diff --git a/workers/src/orchestrate.ts b/workers/src/orchestrate.ts
--- a/workers/src/orchestrate.ts
+++ b/workers/src/orchestrate.ts
@@ -55,6 +55,26 @@
 /** Internal type — handlers return this, handleUnifiedAction stamps server_time */
 type ActionResult = Omit<OddkitEnvelope, "server_time">;
 
+// Governance-driven encoding types
+interface EncodingTypeDef {
+  letter: string;
+  name: string;
+  triggerWords: string[];
+  triggerRegex: RegExp | null;
+  qualityCriteria: Array<{ criterion: string; check: string; gapMessage: string }>;
+}
+
+interface ParsedArtifact {
+  type: string;
+  typeName: string;
+  fields: string[];
+  title: string;
+  body: string;
+}
+
+let cachedEncodingTypes: EncodingTypeDef[] | null = null;
+let cachedEncodingTypesCanonUrl: string | undefined = undefined;
+
 export interface UnifiedParams {
   action: string;
   input: string;
@@ -253,18 +273,166 @@
   return { from: "unknown", to: "unknown" };
 }
 
-function detectEncodeType(input: string): string {
-  if (/\b(decided|decision|chose|choosing|selected|committed to|going with)\b/i.test(input))
-    return "decision";
-  if (/\b(learned|insight|realized|discovered|found that|turns out)\b/i.test(input))
-    return "insight";
-  if (/\b(boundary|limit|constraint|rule|prohibition|must not|never)\b/i.test(input))
-    return "boundary";
-  if (/\b(override|exception|despite|even though|notwithstanding)\b/i.test(input))
-    return "override";
-  return "decision";
+// Discover encoding types from canon governance docs
+async function discoverEncodingTypes(
+  fetcher: ZipBaselineFetcher,
+  canonUrl?: string,
+): Promise<EncodingTypeDef[]> {
+  if (cachedEncodingTypes && cachedEncodingTypesCanonUrl === canonUrl) return cachedEncodingTypes;
+
+  const index = await fetcher.getIndex(canonUrl);
+  const typeArticles = index.entries.filter(
+    (entry: IndexEntry) => entry.tags?.includes("encoding-type") && entry.path.includes("encoding-types/"),
+  );
+
+  const types: EncodingTypeDef[] = [];
+  for (const article of typeArticles) {
+    try {
+      const content = await fetcher.getFile(article.path, canonUrl);
+      if (!content) continue;
+
+      const identityMatch = content.match(/\|\s*Letter\s*\|\s*([A-Z])\s*\|/);
+      const nameMatch = content.match(/\|\s*Name\s*\|\s*([^|]+)\s*\|/);
+      if (!identityMatch) continue;
+
+      const letter = identityMatch[1];
+      const name = nameMatch ? nameMatch[1].trim() : letter;
+
+      const triggerSection = content.match(
+        /## Trigger Words[^\n]*\n[\s\S]*?```\n([\s\S]*?)\n```/,
+      );
+      const triggerWords = triggerSection
+        ? triggerSection[1].split(",").map((w: string) => w.trim()).filter((w: string) => w.length > 0)
+        : [];
+      const triggerRegex =
+        triggerWords.length > 0
+          ? new RegExp("\\b(" + triggerWords.map((w: string) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")).join("|") + ")\\b", "i")
+          : null;
+
+      const criteriaSection = content.match(
+        /## Quality Criteria[\s\S]*?\| Criterion[\s\S]*?\|[-|\s]+\|\n([\s\S]*?)(?=\n\n|\n##|$)/,
+      );
+      const qualityCriteria: Array<{ criterion: string; check: string; gapMessage: string }> = [];
+      if (criteriaSection) {
+        for (const row of criteriaSection[1].split("\n").filter((r: string) => r.includes("|"))) {
+          const cols = row.split("|").map((c: string) => c.trim()).filter((c: string) => c.length > 0);
+          if (cols.length >= 3) {
+            qualityCriteria.push({
+              criterion: cols[0],
+              check: cols[1],
+              gapMessage: cols[2].replace(/^"|"$/g, ""),
+            });
+          }
+        }
+      }
+
+      types.push({ letter, name, triggerWords, triggerRegex, qualityCriteria });
+    } catch {
+      continue;
+    }
+  }
+
+  if (types.length === 0) {
+    // Fallback OLDC+H defaults when no governance docs in canon
+    const defaults: Array<[string, string, string[]]> = [
+      ["D", "Decision", ["decided", "decision", "chose", "committed to", "going with"]],
+      ["O", "Observation", ["observed", "noticed", "found", "measured", "detected"]],
+      ["L", "Learning", ["learned", "realized", "discovered", "turns out", "insight"]],
+      ["C", "Constraint", ["must", "must not", "never", "always", "constraint", "cannot"]],
+      ["H", "Handoff", ["next session", "next step", "todo", "follow up", "blocked by"]],
+    ];
+    for (const [letter, name, words] of defaults) {
+      types.push({
+        letter, name, triggerWords: words,
+        triggerRegex: new RegExp("\\b(" + words.join("|") + ")\\b", "i"),
+        qualityCriteria: [],
+      });
+    }
+  }
+
+  cachedEncodingTypes = types;
+  cachedEncodingTypesCanonUrl = canonUrl;
+  return types;
 }
 
+function isStructuredInput(input: string): boolean {
+  const lines = input.split("\n").filter((l) => l.trim().length > 0);
+  return lines.length > 0 && lines.every((l) => /^[A-Z]\t/.test(l));
+}
+
+function parseStructuredInput(input: string, types: EncodingTypeDef[]): ParsedArtifact[] {
+  const typeMap = new Map(types.map((t) => [t.letter, t.name]));
+  return input.split("\n").filter((l) => l.trim().length > 0).map((line) => {
+    const fields = line.split("\t");
+    const letter = fields[0]?.trim() || "D";
+    return {
+      type: letter, typeName: typeMap.get(letter) || letter,
+      fields, title: fields[1]?.trim() || "", body: fields[2]?.trim() || "",
+    };
+  });
+}
+
+function parseUnstructuredInput(input: string, types: EncodingTypeDef[]): ParsedArtifact[] {
+  const paragraphs = input.split(/\n\n+/).filter((p) => p.trim().length > 0);
+  const artifacts: ParsedArtifact[] = [];
+  for (const para of paragraphs) {
+    let matched = false;
+    for (const t of types) {
+      // DESIGN: no break — a paragraph can match multiple types intentionally.
+      // "We must never deploy without tests" is both Decision and Constraint.
+      // Multi-typing at the server level mirrors what the model would do with
+      // separate TSV rows. Do not add a break here.
+      if (t.triggerRegex && t.triggerRegex.test(para)) {
+        const first = para.split(/[.!?\n]/)[0]?.trim() || para.slice(0, 60);
+        const title = first.split(/\s+/).length <= 12 ? first : first.split(/\s+/).slice(0, 8).join(" ") + "...";
+        artifacts.push({ type: t.letter, typeName: t.name, fields: [t.letter, title, para.trim()], title, body: para.trim() });
+        matched = true;
+      }
+    }
+    if (!matched) {
+      const first = para.split(/[.!?\n]/)[0]?.trim() || para.slice(0, 60);
+      const title = first.split(/\s+/).length <= 12 ? first : first.split(/\s+/).slice(0, 8).join(" ") + "...";
+      const fallback = types[0] || { letter: "D", name: "Decision" };
+      artifacts.push({ type: fallback.letter, typeName: fallback.name, fields: [fallback.letter, title, para.trim()], title, body: para.trim() });
+    }
+  }
+  return artifacts;
+}
+
+function scoreArtifactQuality(
+  artifact: ParsedArtifact,
+  criteria: Array<{ criterion: string; check: string; gapMessage: string }>,
+): { score: number; maxScore: number; level: string; gaps: string[]; suggestions: string[] } {
+  const gaps: string[] = [];
+  const suggestions: string[] = [];
+  let score = 0;
+
+  if (criteria.length === 0) {
+    if (artifact.body.split(/\s+/).length >= 10) score++;
+    else suggestions.push("Expand — more detail improves quality");
+    if (/because|due to|since/i.test(artifact.body)) score++;
+    else suggestions.push("Add rationale");
+    return { score, maxScore: 2, level: score >= 2 ? "adequate" : "weak", gaps, suggestions };
+  }
+
+  for (const c of criteria) {
+    const ck = c.check.toLowerCase();
+    let passed = false;
+    if (ck.includes("non-empty")) passed = artifact.fields.length > 3 || artifact.body.length > 0;
+    else if (ck.includes("10")) passed = artifact.body.split(/\s+/).length >= 10;
+    else if (ck.includes("number") || ck.includes("concrete")) passed = /\d/.test(artifact.body);
+    else if (ck.includes("interpretation") || ck.includes("does not contain")) passed = !/should|better|worse|means|implies/i.test(artifact.body);
+    else if (ck.includes("prohibition") || ck.includes("requirement")) passed = /must|must not|never|always|shall/i.test(artifact.body);
+    else passed = artifact.body.split(/\s+/).length >= 5;
+    if (passed) score++;
+    else { gaps.push(c.gapMessage); suggestions.push(c.gapMessage); }
+  }
+
+  const mx = criteria.length;
+  const level = score >= mx ? "strong" : score >= Math.ceil(mx * 0.6) ? "adequate" : score >= Math.ceil(mx * 0.4) ? "weak" : "insufficient";
+  return { score, maxScore: mx, level, gaps, suggestions };
+}
+
 // ──────────────────────────────────────────────────────────────────────────────
 // Score entries (legacy, kept for backward-compat in existing action handlers)
 // ──────────────────────────────────────────────────────────────────────────────
@@ -563,6 +731,8 @@
   // Also clear the in-memory BM25 index
   cachedBM25Index = null;
   cachedBM25Entries = null;
+  cachedEncodingTypes = null;
+  cachedEncodingTypesCanonUrl = undefined;
 
   return {
     action: "cleanup_storage",
@@ -1246,93 +1416,66 @@
 ): Promise<ActionResult> {
   const startMs = Date.now();
   const fullInput = context ? `${input}\n${context}` : input;
-  const encodeType = detectEncodeType(input);
 
-  const firstSentence = input.split(/[.!?\n]/)[0]?.trim() || input.slice(0, 60);
-  const title =
-    firstSentence.split(/\s+/).length <= 12
-      ? firstSentence
-      : firstSentence.split(/\s+/).slice(0, 8).join(" ") + "...";
+  const types = await discoverEncodingTypes(fetcher, canonUrl);
+  const structured = isStructuredInput(fullInput);
+  const artifacts = structured
+    ? parseStructuredInput(fullInput, types)
+    : parseUnstructuredInput(fullInput, types);
 
-  let rationale: string | null = null;
-  const rMatch =
-    fullInput.match(/because\s+(.+?)(?:\.|$)/i) || fullInput.match(/due to\s+(.+?)(?:\.|$)/i);
-  if (rMatch && rMatch[1].split(/\s+/).length >= 3) rationale = rMatch[1].trim();
+  // Score each artifact using its type's quality criteria
+  const scoredArtifacts = artifacts.map((a) => {
+    const typeDef = types.find((t) => t.letter === a.type);
+    const criteria = typeDef ? typeDef.qualityCriteria : [];
+    const quality = scoreArtifactQuality(a, criteria);
+    return { title: a.title, type: a.type, typeName: a.typeName, content: a.body, fields: a.fields, quality };
+  });
 
-  const constraints: string[] = [];
-  for (const s of fullInput.split(/[.!?\n]+/).filter((s) => s.trim().length > 5)) {
-    if (/\b(must|shall|required|always|never|constraint|cannot)\b/i.test(s))
-      constraints.push(s.trim());
-  }
-
-  let score = 0;
-  if (input.split(/\s+/).length >= 10) score++;
-  if (rationale) score++;
-  if (constraints.length > 0) score++;
-  if (/\b(alternative|instead|option|versus|vs|rather than)\b/i.test(fullInput)) score++;
-  if (/\b(irreversib|reversib|temporary|permanent|until)\b/i.test(fullInput)) score++;
-  const qualityLevel =
-    score >= 4 ? "strong" : score >= 3 ? "adequate" : score >= 2 ? "weak" : "insufficient";
-
-  const gaps: string[] = [];
-  const suggestions: string[] = [];
-  if (!rationale) {
-    gaps.push("No rationale detected — add 'because...'");
-    suggestions.push("Add explicit rationale");
-  }
-  if (constraints.length === 0)
-    suggestions.push("Add constraints: what boundaries does this create?");
-  if (encodeType === "decision" && !/\b(alternative|instead)\b/i.test(fullInput))
-    suggestions.push("Document alternatives considered");
-  if (!/\b(irreversib|reversib|temporary|permanent)\b/i.test(fullInput))
-    suggestions.push("Note whether this is reversible or permanent");
-
-  const artifact = {
-    title,
-    type: encodeType,
-    decision: input.trim(),
-    rationale: rationale || "(not provided — add 'because...' to strengthen)",
-    constraints,
-    status: qualityLevel === "strong" || qualityLevel === "adequate" ? "recorded" : "draft",
-    timestamp: new Date().toISOString(),
-  };
-
-  // Update state
+  // Update state — track all encoded type letters
   const updatedState = state ? initState(state) : undefined;
   if (updatedState) {
-    updatedState.decisions_encoded.push(title);
+    for (const a of artifacts) {
+      updatedState.decisions_encoded.push(`${a.type}:${a.title}`);
+    }
   }
 
-  const lines = [
-    `Encoded ${encodeType}: ${title}`,
-    `Status: ${artifact.status} | Quality: ${qualityLevel} (${score}/5)`,
+  // Build assistant_text as markdown with per-artifact sections
+  const lines: string[] = [
+    `## Encoded ${scoredArtifacts.length} artifact${scoredArtifacts.length !== 1 ? "s" : ""}`,
     "",
   ];
-  lines.push(`Decision: ${input.trim()}`, `Rationale: ${artifact.rationale}`, "");
-  if (constraints.length > 0) {
-    lines.push("Constraints:");
-    for (const c of constraints) lines.push(`  - ${c}`);
+  for (const a of scoredArtifacts) {
+    lines.push(`### [${a.type}] ${a.typeName}: ${a.title}`);
+    lines.push(`**Quality:** ${a.quality.level} (${a.quality.score}/${a.quality.maxScore})`);
     lines.push("");
-  }
-  if (gaps.length > 0) {
-    lines.push("Gaps:");
-    for (const g of gaps) lines.push(`  - ${g}`);
+    lines.push(a.content);
     lines.push("");
+    if (a.quality.gaps.length > 0) {
+      lines.push("**Gaps:**");
+      for (const g of a.quality.gaps) lines.push(`- ${g}`);
+      lines.push("");
+    }
+    if (a.quality.suggestions.length > 0) {
+      lines.push("**Suggestions:**");
+      for (const s of a.quality.suggestions) lines.push(`- ${s}`);
+      lines.push("");
+    }
   }
-  if (suggestions.length > 0) {
-    lines.push("Suggestions:");
-    for (const s of suggestions) lines.push(`  - ${s}`);
-    lines.push("");
+
+  lines.push("---");
+  lines.push("**Encoding types (governance):**");
+  for (const t of types) {
+    lines.push(`- **${t.letter}** — ${t.name}`);
   }
 
   return {
     action: "encode",
     result: {
       status: "ENCODED",
-      artifact,
-      quality: { level: qualityLevel, score, max_score: 5, gaps, suggestions },
+      artifacts: scoredArtifacts,
+      governance: types.map((t) => ({ letter: t.letter, name: t.name })),
       persist_required: true,
-      next_action: "Save this artifact to the project's storage (project journal, file, database). Encode does NOT persist.",
+      next_action: "Save these artifacts to storage. Encode does NOT persist.",
     },
     state: updatedState,
     assistant_text: lines.join("\n").trim(),


Reviewed by Cursor Bugbot for commit 121dd07.

…Action pattern

The context parameter was accepted but never used in runEncodeAction,
silently discarding supplementary context provided by callers. This
restores the fullInput merging pattern used by runGateAction so that
context is included in type detection, artifact parsing, and quality
scoring.
@klappy klappy merged commit bef6c83 into main Apr 16, 2026
5 checks passed
klappy pushed a commit to klappy/klappy.dev that referenced this pull request Apr 16, 2026
Four governance bugs produced four code bugs in PR klappy/oddkit#96.
The server had no governance to follow for these cases and improvised.

Fixes:
1. Removed bare 'found' from observation trigger words — collided with
   Learning's 'found that' during fallback regex classification
2. Added fallback: true to observation.md frontmatter — observation
   is the canonical fallback for unmatched paragraphs
3. Added ## Fallback Behavior section to how-to-write-encoding-types.md
   — specifies how fallback type is resolved via frontmatter
4. Added ## Scoring Algorithm section to how-to-write-encoding-types.md
   — centrally defines score→level mapping (strong=max, adequate=60%+,
   weak=40%+, insufficient<40%)
5. Added ## Context vs Input section to how-to-write-encoding-types.md
   — input generates artifacts, context only informs quality scoring

Prompt over code requires complete governance. These gaps caused the
server to improvise, and bugbot caught the improvisations.
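The fallback resolution described in fixes 2 and 3 can be sketched as frontmatter-driven lookup: prefer the type explicitly marked fallback, otherwise default to the first discovered type (the doc shape here is hypothetical; only the `fallback: true` field comes from the fix list above):

```typescript
// Hypothetical doc shape; `fallback` mirrors the frontmatter flag.
interface TypeDoc {
  letter: string;
  fallback?: boolean;
}

// Prefer the type marked fallback: true; otherwise fall back to the
// first discovered type, matching the earlier behavior.
function resolveFallback(types: TypeDoc[]): TypeDoc | undefined {
  return types.find((t) => t.fallback) ?? types[0];
}
```

This moves the fallback decision out of code and into the governance docs themselves.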
klappy added a commit to klappy/klappy.dev that referenced this pull request Apr 16, 2026
…98)


Co-authored-by: Claude (oddkit project) <chris@klapp.dev>
klappy pushed a commit that referenced this pull request Apr 16, 2026
Server now follows klappy://odd/encoding-types/how-to-write-encoding-types
section 'Context vs Input': input generates artifacts; context only
informs quality scoring.

Before this change, fullInput (input + context) was passed to both
the parser and the scorer, causing context paragraphs to become
separate standalone artifacts. The governance says context is
metadata, not content.

Changes:
- runEncodeAction: parsers receive input only (not fullInput)
- runEncodeAction: scoring receives input + context per artifact so
  background information still counts toward quality
- scoreArtifactQuality: accepts optional scoringText parameter that
  defaults to artifact.body when not provided
- Inline comments cite the governance doc to prevent regression

Closes the gap between governance and code surfaced during PR #96
testing.
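The Context vs Input contract above reduces to one split: only `input` is parsed into artifacts, while `context` is appended only to the text used for quality scoring. A minimal sketch of that separation (function name is illustrative):

```typescript
// Only `input` generates artifacts; `context` informs scoring only.
function splitForEncode(input: string, context?: string) {
  return {
    parseText: input,                                      // fed to the parsers
    scoringText: context ? `${input}\n${context}` : input, // fed to the scorer
  };
}
```

Passing the merged text to the parser was exactly the bug: context paragraphs became standalone artifacts.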
klappy added a commit that referenced this pull request Apr 17, 2026
Mirrors the PR #96 encode pattern. Extracts challenge behavior from
live governance articles (landed in klappy.dev canon via PR #99)
rather than hardcoded source logic.

New functions in workers/src/orchestrate.ts:
- discoverChallengeTypes — per-canonUrl cached type discovery
- fetchBasePrerequisites — universal prerequisite checks
- fetchNormativeVocabulary — RFC 2119 + architectural load-bearing terms
- fetchStakesCalibration — mode-to-depth filter
- extractPrereqTable / extractKeywordsFromCheck — shared helpers

Refactored:
- runChallengeAction — replaces hardcoded detectClaimType /
  generateChallenges / findTensions / findMissingPrerequisites
  with governance extraction. Supports multi-match. Filters output
  by stakes calibration based on mode parameter.
- runCleanupStorage — clears all four new caches on invalidation

Invariant: voice-dump mode suppresses all challenge output
regardless of matched types. Load-bearing per stakes-calibration
governance — some modes exist for raw capture and pressure-testing
at that stage damages the mode.

Graceful degradation: missing governance articles fall back to
minimal built-in behavior with warnings, rather than failing.

Co-authored-by: Claude <noreply@anthropic.com>
klappy added a commit that referenced this pull request Apr 17, 2026
Refactor runChallengeAction in workers/src/orchestrate.ts to extract
challenge-type behavior from canon governance articles at runtime rather
than hardcoding claim-type detection, questions, prerequisites, and
tension rules in source. Structural mirror of PR #96 (encode).

Detection upgraded mid-implementation from regex-OR to BM25 + stemming
after the gauntlet revealed that regex-based matching was morphologically
brittle ("coin" doesn't match trigger "coining"). The pivot removed an
entire class of bug and seeded a reusable pattern for future
governance-driven tools.

Changes in workers/src/orchestrate.ts:
- New: ChallengeTypeDef, BasePrerequisite, NormativeVocabulary,
  StakesModeConfig, StakesCalibration
- New: discoverChallengeTypes (builds per-canonUrl BM25 index over
  detection text), fetchBasePrerequisites, fetchNormativeVocabulary,
  fetchStakesCalibration — each with per-canonUrl cache and graceful
  degradation on missing articles
- New: evaluatePrerequisiteCheck — interprets natural-language check
  strings from prerequisite overlay tables
- Refactored runChallengeAction: multi-match via BM25 score > 0, base
  + overlay prerequisite aggregation, stakes calibration filtering,
  voice-dump suppression invariant, governance-driven tension detection
- Extended runCleanupStorage with five new cache clears (types,
  type-index, base prerequisites, vocabulary, calibration)
- Removed dead detectClaimType (legacy src/tasks/challenge.js retains
  its copy for CLI backward-compat)
- Added CHALLENGE_STOP_WORDS set preserving modal verbs as signal

Changes in workers/src/bm25.ts (backward-compatible extension):
- tokenize(), buildBM25Index() accept optional stopWords: Set<string>
- BM25Index gains optional stopWords field so searchBM25 tokenizes
  queries consistently with the index
- Default behavior unchanged — existing callers unaffected
- Motivation: default STOP_WORDS filters modals (must, should, shall,
  may, not) which are signal for challenge-type detection

New tests: workers/test/governance-parser.test.mjs — 94 assertions
against live governance articles fetched from klappy.dev raw. Covers
type parsing, fallback resolution, BM25 detection, stemming regression
cases (coin/coining, propose/proposed, principle/principles), multi-
match, and the voice-dump suppression invariant. 94/94 pass.

Bugs the gauntlet caught on this PR:
1. Voice-dump suppression invariant would have shipped broken — the
   calibration cell reads "none (suppress all challenge)" not bare
   "none". Strict-equality parser would have produced a single-element
   array, voice-dump mode would have surfaced all challenges in prod.
2. Morphological brittleness in regex detection (coin vs coining) —
   triggered the pivot to BM25 + stemming.
3. Default BM25 STOP_WORDS silently breaks strong-claim and proposal
   detection by filtering modal verbs. Fixed via custom stop word set.
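The real fix uses BM25 with stemming, but the morphology problem in bug 2 can be shown with a deliberately naive suffix-stripper (illustrative only — this is not the stemmer oddkit uses): a literal word-boundary regex for "coin" misses "coining", while stem-level comparison does not.

```typescript
// Deliberately naive stemmer: strip a common suffix for illustration.
function naiveStem(word: string): string {
  return word.toLowerCase().replace(/(ing|ed|es|s)$/, "");
}

// Tokenize, stem each token, compare against the stemmed trigger.
function stemMatch(text: string, trigger: string): boolean {
  const target = naiveStem(trigger);
  return text.split(/\W+/).some((w) => naiveStem(w) === target);
}
```

Even this crude version closes the coin/coining gap that broke regex detection; a real stemmer generalizes it.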

Verification:
- npm run typecheck: clean
- tests/smoke.sh: 6/6 pass (legacy CLI path — backward compat preserved)
- workers/test/governance-parser.test.mjs: 94/94 pass
- AI voice clichés audit on new comments: clean
- oddkit_preflight, challenge, gate, validate: all run; gate NOT_READY
  due to same hardcoded-logic gap as challenge pre-refactor (flagged as
  follow-up)

Response shape change: adds mode, matched_types, type_definitions,
block_until_addressed; removes claim_type. Consumed programmatically,
not rendered.

Follow-ups flagged:
- Encode parity PR — same regex-OR brittleness in runEncodeAction;
  pattern proven here, port will be near-mechanical
- klappy.dev meta governance PR — "compiles into a case-insensitive
  word-boundary regex" is now stale language
- Gate refactor candidate — same hardcoded-logic shape as challenge pre-refactor

Refs:
- Depends on: klappy/klappy.dev#99 (governance articles this code reads)
- Structural mirror: #96 (governance-driven encode)
- Evidence: docs/oddkit/evidence/challenge-governance-code-refactor.md
klappy added a commit that referenced this pull request Apr 19, 2026
Retrofits oddkit_encode to the envelope contract established by the telemetry_policy canary (canon/constraints/core-governance-baseline) and adds the DOLCHEO vocabulary features that postdate encode's original canary refactor (PR #96). Two-tier cascade — encoding-types are canon-only per the baseline contract, not required-baseline. Version bump to 0.18.0 MINOR (additive).

Branch-preview smoke: 61/61 pass. Sonnet 4.6 validator: VERIFIED, 11/11 checks, 3 non-blocking advisories.

Ref: klappy://odd/handoffs/2026-04-20-p1-2-encode-canary
Ref: klappy://canon/definitions/dolcheo-vocabulary
Ref: klappy://canon/constraints/core-governance-baseline