feat(telemetry_policy): canary refactor — headers from canon at runtime by klappy · Pull Request #106 · klappy/oddkit

klappy · 2026-04-18T01:23:34Z

Canary refactor — the governance anti-pattern sweep begins here

Replaces the hardcoded self_report_headers dictionary in telemetry_policy with a runtime parse of the ### Self-Report Fields table in canon/constraints/telemetry-governance.md. Response envelope now declares governance_source: "canon" | "minimal".

Why this one first

This is the lowest-stakes instance of the Vodka anti-pattern cataloged in docs/oddkit/audit/governance-anti-pattern-sweep-2026-04-17.md (PR #105). The goal is to prove the refactor template — single feature PR, single promotion PR, canon-change-no-redeploy preview smoke — before applying it to higher-stakes tools (validate, orient, gate, encode).

Scope is deliberately tight: one tool, one helper, ~80 lines of diff. If this takes more than 2 PRs to ship, the sweep pauses and we diagnose the template.

Contract conformance

Implements the three-tier resolution stack from klappy/klappy.dev#101 (canon/constraints/core-governance-baseline):

Tier 1 (canon): getFile("canon/constraints/telemetry-governance.md") → parseSelfReportHeadersTable → governance_source: "canon"
Tier 2 (bundled baseline): deferred — the manifest + baseline directory + build-time schema check arrive in follow-up work once the contract graduates from status: draft to status: active
Tier 3 (minimal): 8 stable headers hardcoded as the floor. governance_source: "minimal" signals degradation to callers.

The companion canon doc ships tier:1 status:draft and uses this refactor as its first validation case.
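
The tier 1 and tier 3 fallback described above can be sketched in isolation. This is a minimal illustration, not the worker's handler: `resolveSelfReportHeaders`, `FetchFile`, `ParseHeaders`, and `HeaderMap` are hypothetical names, and the fetcher and parser are injected so the control flow can be exercised without a canon source. Tier 2 is omitted because the PR defers it.

```typescript
// Hypothetical sketch of the tier 1 (canon) / tier 3 (minimal) resolution.
// All names here are illustrative assumptions, not the worker's actual API.
type GovernanceSource = "canon" | "minimal";
type HeaderMap = Record<string, string>;
type FetchFile = (path: string) => Promise<string | null>;
type ParseHeaders = (markdown: string) => HeaderMap | null;

interface Resolution {
  governance_source: GovernanceSource;
  self_report_headers: HeaderMap;
}

const CANON_PATH = "canon/constraints/telemetry-governance.md";

async function resolveSelfReportHeaders(
  fetchFile: FetchFile,
  parse: ParseHeaders,
  minimal: HeaderMap,
): Promise<Resolution> {
  try {
    // Tier 1: live canon fetch, then table parse.
    const content = await fetchFile(CANON_PATH);
    const parsed = content ? parse(content) : null;
    if (parsed && Object.keys(parsed).length > 0) {
      return { governance_source: "canon", self_report_headers: parsed };
    }
  } catch {
    // A failed fetch is not fatal; fall through to the minimal tier.
  }
  // Tier 3: hardcoded floor. The source label makes the degradation visible.
  return { governance_source: "minimal", self_report_headers: minimal };
}
```

Returning the source label together with the headers means a caller never has to infer degradation from the header values themselves.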

Verification

npm run typecheck clean
Parser unit-tested against live canon content: 8/8 headers parsed
Parser degradation paths (no section, empty table) return null
Preview smoke against Cloudflare preview env with canon-change-no-redeploy (run after merge to main)
Prod smoke after promotion PR

Public contract — additive, not breaking

self_report_headers keys unchanged (still the same 8 x-oddkit-* headers)
self_report_headers values change slightly — canon's terser descriptions now win when the canon tier succeeds, which is the intended behavior per canon/principles/dry-canon-says-it-once
New field governance_source added to the result object
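
For consumers, the additive change amounts to one new field on the result object. The type below is a sketch assembled from the field names visible in this PR's diff (`policy`, `governance_uri`, `governance_source`, `self_report_headers`, `generated_at`); the names `TelemetryPolicyResult` and `isDegraded` are hypothetical, and the worker itself does not export such a type as far as this PR shows. Note that the handler's local variable is typed with a third value, "baseline", reserved for the deferred tier 2; this sketch covers only the two values the PR can currently emit.

```typescript
// Hypothetical consumer-side view of the telemetry_policy result object.
type GovernanceSource = "canon" | "minimal";

interface TelemetryPolicyResult {
  policy: string;
  governance_uri: string;
  governance_source: GovernanceSource; // new in this PR
  self_report_headers: Record<string, string>; // same 8 keys as before
  generated_at: string; // ISO 8601 timestamp
}

// "minimal" means canon was unreachable or the table did not parse,
// so the header descriptions come from the hardcoded floor.
function isDegraded(result: TelemetryPolicyResult): boolean {
  return result.governance_source === "minimal";
}
```

A caller that ignores `governance_source` keeps working unchanged, which is what makes the change additive.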

Refactor discipline (from PR #100 post-mortem)

Single site touched (workers/src/index.ts)
No forward-fixes planned — if prod smoke fails, revert within 15 min and reopen
Promotion PR follows only after preview smoke passes
Audit row for telemetry_policy gets stamped when this lands

Overview
telemetry_policy now derives self_report_headers from canon at runtime by parsing the ### Self-Report Fields markdown table in canon/constraints/telemetry-governance.md, and adds governance_source (canon|minimal) to explicitly signal whether canon parsing succeeded or the minimal fallback was used.

Introduces a shared parseTableRow helper in markdown-utils.ts (reused by orchestrate.ts) and extends the governance parser test to validate the self-report headers extraction and degradation behavior.

Reviewed by Cursor Bugbot for commit db1936d. Bugbot is set up for automated code reviews on this repo.

Replaces the hardcoded self_report_headers dictionary with a runtime parse of the ### Self-Report Fields table in canon/constraints/telemetry-governance.md. Response envelope now declares governance_source: 'canon' when the fetch succeeds and the table parses, 'minimal' when it falls back to the shipped baseline.

This is the canary refactor for the governance anti-pattern sweep (docs/oddkit/audit/governance-anti-pattern-sweep-2026-04-17). It conforms to the three-tier resolution contract drafted in klappy/klappy.dev#101 (canon/constraints/core-governance-baseline), exercising tiers 1 (live canon) and 3 (minimal baseline in code). Tier 2 (bundled baseline directory with manifest) and the build-time schema check arrive in follow-up work once the contract graduates from status:draft to status:active.

Implementation:
- New helper parseSelfReportHeadersTable in index.ts parses the '### Self-Report Fields' table section from the canon doc.
- Parser is permissive (whitespace + backticks) and fails closed to null so the caller falls back to the minimal baseline rather than hiding the degradation.
- Minimal baseline remains the 8 stable headers; canon controls the descriptions once live.

Verified:
- npm run typecheck: clean
- Parser unit-tested against live canon content: 8/8 headers parsed
- Parser degradation paths (no section, empty table) return null

Refactor discipline this commit follows (from PR #100 post-mortem):
- Single feature PR, single site touched
- Public contract (MCP tool response) changes are additive (governance_source field added; self_report_headers keys unchanged)
- Preview smoke against live prod will verify canon-tier response before promotion

cloudflare-workers-and-pages · 2026-04-18T01:23:42Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ Deployment successful! View logs | oddkit | `db1936d` | Commit Preview URL, Branch Preview URL | Apr 18 2026, 10:21 PM |

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

✅ Fixed: Duplicated parseTableRow risks silent drift between copies
- Extracted parseTableRow into a new shared workers/src/markdown-utils.ts module and updated both index.ts and orchestrate.ts to import from it, eliminating the duplicated implementations; typecheck passes.

Preview (7b2d68bcfb)

diff --git a/workers/src/index.ts b/workers/src/index.ts
--- a/workers/src/index.ts
+++ b/workers/src/index.ts
@@ -22,6 +22,7 @@
 import { RequestTracer } from "./tracing";
 import { parseConsumerLabel } from "./telemetry";
 import { renderNotFoundPage } from "./not-found-ui";
+import { parseTableRow } from "./markdown-utils";
 import pkg from "../package.json";
 
 export type { Env };
@@ -29,6 +30,40 @@
 const BUILD_VERSION = pkg.version;
 
 // ──────────────────────────────────────────────────────────────────────────────
+// Canon-table parsing helper.
+//
+// parseSelfReportHeadersTable extracts the self-report header contract from
+// canon/constraints/telemetry-governance.md. The table format is governed by
+// the canon doc itself; this parser is deliberately permissive (whitespace,
+// backticks around header name) and fails closed to null so the caller can
+// fall back to the minimal baseline without hiding the degradation.
+// ──────────────────────────────────────────────────────────────────────────────
+
+function parseSelfReportHeadersTable(markdown: string): Record<string, string> | null {
+  // Target section: "### Self-Report Fields" — grab the table that follows.
+  // Stop at the next `###` or `##` heading, whichever comes first.
+  const section = markdown.match(
+    /###\s+Self-Report Fields[^\n]*\n([\s\S]*?)(?=\n###|\n##|$)/,
+  );
+  if (!section) return null;
+
+  const headers: Record<string, string> = {};
+  for (const raw of section[1].split("\n")) {
+    if (!raw.includes("|")) continue;
+    const cols = parseTableRow(raw);
+    // Expected layout: | Field | Header | Source |
+    // Skip header row, separator row, and any malformed row.
+    if (cols.length < 2) continue;
+    const fieldDescription = cols[0];
+    const headerName = cols[1].replace(/`/g, "").trim();
+    if (!headerName.startsWith("x-oddkit-")) continue; // skip header/separator
+    headers[headerName] = fieldDescription;
+  }
+
+  return Object.keys(headers).length > 0 ? headers : null;
+}
+
+// ──────────────────────────────────────────────────────────────────────────────
 // Consumer identification nudge
 //
 // DO NOT add session caching, sticky identification, or per-session memory.
@@ -451,7 +486,7 @@
 
   server.tool(
     "telemetry_policy",
-    "Return oddkit telemetry and sharing policy guidance. What is tracked, what is excluded, and why. Fetched from canonical governance document at runtime.",
+    "Return oddkit telemetry and sharing policy guidance. What is tracked, what is excluded, and why. Fetched from canonical governance document at runtime. Response envelope declares governance_source (canon|baseline|minimal) per canon/constraints/core-governance-baseline.",
     {},
     {
       readOnlyHint: true,
@@ -460,17 +495,52 @@
       openWorldHint: true,
     },
     async () => {
-      // Fetch the governance doc from canon
+      // Governance resolution per canon/constraints/core-governance-baseline:
+      //   1. Live canon fetch (preferred) → governance_source: "canon"
+      //   2. Minimal baseline (shipped in code) → governance_source: "minimal"
+      //
+      // This canary refactor implements tiers 1 and 3 only. The bundled
+      // baseline tier (2) and the build-time schema check arrive in follow-up
+      // work; the manifest + baseline directory are not yet in place.
       const fetcher = new ZipBaselineFetcher(env);
-      let policyContent = "Governance document not found. See https://github.com/klappy/klappy.dev/blob/main/canon/constraints/telemetry-governance.md";
+      let policyContent: string | null = null;
+      let selfReportHeaders: Record<string, string> | null = null;
+      let governanceSource: "canon" | "baseline" | "minimal" = "minimal";
 
       try {
         const content = await fetcher.getFile("canon/constraints/telemetry-governance.md");
-        if (content) policyContent = content;
+        if (content) {
+          policyContent = content;
+          const parsed = parseSelfReportHeadersTable(content);
+          if (parsed && Object.keys(parsed).length > 0) {
+            selfReportHeaders = parsed;
+            governanceSource = "canon";
+          }
+        }
       } catch {
-        // Fall through to default message
+        // Fall through to minimal tier below
       }
 
+      if (governanceSource === "minimal") {
+        // Minimal baseline — the tool remains useful when canon is unreachable
+        // or the table cannot be parsed. These eight headers are the stable
+        // self-report contract; if canon adds a 9th, the "canon" tier delivers
+        // it and this list stays as the floor.
+        selfReportHeaders = {
+          "x-oddkit-client": "Your client name (highest priority identifier)",
+          "x-oddkit-client-version": "Your client version",
+          "x-oddkit-agent-name": "The AI agent name",
+          "x-oddkit-agent-version": "The AI agent version",
+          "x-oddkit-surface": "Where this is running (e.g. claude.ai, vscode)",
+          "x-oddkit-contact-url": "URL for your project or org",
+          "x-oddkit-policy-url": "Your privacy/telemetry policy URL",
+          "x-oddkit-capabilities": "Comma-separated capability list",
+        };
+        if (!policyContent) {
+          policyContent = "Governance document not reachable. See https://github.com/klappy/klappy.dev/blob/main/canon/constraints/telemetry-governance.md";
+        }
+      }
+
       return {
         content: [{
           type: "text" as const,
@@ -479,16 +549,8 @@
             result: {
               policy: policyContent,
               governance_uri: "klappy://canon/constraints/telemetry-governance",
-              self_report_headers: {
-                "x-oddkit-client": "Your client name (highest priority identifier)",
-                "x-oddkit-client-version": "Your client version",
-                "x-oddkit-agent-name": "The AI agent name",
-                "x-oddkit-agent-version": "The AI agent version",
-                "x-oddkit-surface": "Where this is running (e.g. claude.ai, vscode)",
-                "x-oddkit-contact-url": "URL for your project or org",
-                "x-oddkit-policy-url": "Your privacy/telemetry policy URL",
-                "x-oddkit-capabilities": "Comma-separated capability list",
-              },
+              governance_source: governanceSource,
+              self_report_headers: selfReportHeaders,
               generated_at: new Date().toISOString(),
             },
           }, null, 2),

diff --git a/workers/src/markdown-utils.ts b/workers/src/markdown-utils.ts
new file mode 100644
--- /dev/null
+++ b/workers/src/markdown-utils.ts
@@ -1,0 +1,24 @@
+/**
+ * Shared markdown parsing helpers.
+ *
+ * Keep this module dependency-free so it can be imported from any code path
+ * (orchestrate, index, future canon readers) without pulling in unrelated
+ * state. Every helper here must be pure and stateless.
+ */
+
+/**
+ * Parse a single markdown table row into trimmed cell values, preserving
+ * legitimately-empty middle cells. Only the leading and trailing empty strings
+ * produced by splitting a `| a | b |`-style row are stripped — a prior
+ * `.filter(c => c.length > 0)` approach also dropped empty interior cells,
+ * which silently collapsed the column count and caused `cols.length >= N`
+ * guards to misfire (e.g. a voice-dump row with an empty tiers cell).
+ */
+export function parseTableRow(row: string): string[] {
+  const parts = row.split("|");
+  // Strip the leading empty produced by a leading `|`, if present
+  if (parts.length > 0 && parts[0].trim() === "") parts.shift();
+  // Strip the trailing empty produced by a trailing `|`, if present
+  if (parts.length > 0 && parts[parts.length - 1].trim() === "") parts.pop();
+  return parts.map((c) => c.trim());
+}

diff --git a/workers/src/orchestrate.ts b/workers/src/orchestrate.ts
--- a/workers/src/orchestrate.ts
+++ b/workers/src/orchestrate.ts
@@ -18,6 +18,7 @@
   type SectionResult,
 } from "./zip-baseline-fetcher";
 import { buildBM25Index, searchBM25, type BM25Index } from "./bm25";
+import { parseTableRow } from "./markdown-utils";
 import type { RequestTracer } from "./tracing";
 import pkg from "../package.json";
 
@@ -155,27 +156,6 @@
 }
 
 // ──────────────────────────────────────────────────────────────────────────────
-// Markdown table helpers
-// ──────────────────────────────────────────────────────────────────────────────
-
-/**
- * Parse a single markdown table row into trimmed cell values, preserving
- * legitimately-empty middle cells. Only the leading and trailing empty strings
- * produced by splitting a `| a | b |`-style row are stripped — a prior
- * `.filter(c => c.length > 0)` approach also dropped empty interior cells,
- * which silently collapsed the column count and caused `cols.length >= N`
- * guards to misfire (e.g. a voice-dump row with an empty tiers cell).
- */
-function parseTableRow(row: string): string[] {
-  const parts = row.split("|");
-  // Strip the leading empty produced by a leading `|`, if present
-  if (parts.length > 0 && parts[0].trim() === "") parts.shift();
-  // Strip the trailing empty produced by a trailing `|`, if present
-  if (parts.length > 0 && parts[parts.length - 1].trim() === "") parts.pop();
-  return parts.map((c) => c.trim());
-}
-
-// ──────────────────────────────────────────────────────────────────────────────
 // BM25 Index Cache (per-request, lazy)
 // ──────────────────────────────────────────────────────────────────────────────
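
The behavior the parseTableRow doc comment describes (outer empties stripped, interior empties kept) is easiest to see on concrete rows. The function body below is copied from the markdown-utils.ts diff above, minus the `export` so the snippet runs standalone; the sample rows are illustrative and not taken from canon.

```typescript
// Copied from the markdown-utils.ts diff above (export removed for a standalone run).
function parseTableRow(row: string): string[] {
  const parts = row.split("|");
  // Strip the leading empty produced by a leading `|`, if present
  if (parts.length > 0 && parts[0].trim() === "") parts.shift();
  // Strip the trailing empty produced by a trailing `|`, if present
  if (parts.length > 0 && parts[parts.length - 1].trim() === "") parts.pop();
  return parts.map((c) => c.trim());
}

// A normal data row: three trimmed cells, backticks left for the caller to strip.
console.log(parseTableRow("| Client name | `x-oddkit-client` | header |"));
// → ["Client name", "`x-oddkit-client`", "header"]

// An empty interior cell survives, so the column count stays at 3.
console.log(parseTableRow("| a |  | c |"));
// → ["a", "", "c"]

// A separator row parses like any other row; callers filter it out themselves.
console.log(parseTableRow("|---|---|---|"));
// → ["---", "---", "---"]
```

Because separator and header rows come back as ordinary cells, parseSelfReportHeadersTable relies on its `x-oddkit-` prefix check to skip them.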


Reviewed by Cursor Bugbot for commit af861a7.

The canary refactor (klappy/oddkit#106) reads this table at runtime to populate telemetry_policy's self_report_headers response field. The prior 3-column schema (Field, Header, Source) gave callers only short labels ('Client name'); the hardcoded dictionary in the worker had richer per-field guidance ('Your client name (highest priority identifier)') that was lost when canon became the source of truth.

Extending the table to 4 columns lets canon carry the authoritative per-header description. This is a DRY consolidation: the guidance lives in one place (here), the worker parses it at runtime, and updates flow through canon rather than code.

Examples added per field (client names, surface identifiers, capability list format) so the table doubles as implementer docs for consumers wiring up their own telemetry headers.

Related: klappy/oddkit#106 (canary refactor), #101 (core-governance-baseline contract draft).

Addresses execution-mode challenge gaps on PR #106:

1. Information regression fixed: the parser now reads canon's fourth column (Description) instead of its first column (Field label). Canon was extended with richer per-header descriptions in klappy/klappy.dev#102; this commit updates the parser to consume that new column.
2. Tests committed: Test 8 added to workers/test/governance-parser.test.mjs covering 8/8 header extraction, non-trivial description lengths, and degradation paths (no section, empty table). All 105 tests pass against the unmerged klappy.dev branch via KLAPPYDEV_RAW override.

Still outstanding (follow-up work, not blocking the canary):
- parseTableRow duplicated across workers/src/index.ts and workers/src/orchestrate.ts. Accepted duplication for now, flagged in both sites; export-and-share refactor lands when the sweep surfaces more duplication candidates.
- Preview smoke against Cloudflare preview with the extended canon loaded but no worker redeploy; run manually after this PR deploys.

Companion PR: klappy/klappy.dev#102 (canon extension). This worker change is backward-compatible with the old 3-column table: the parser requires 4 cols, so against the old canon it falls through to the minimal baseline tier. Once klappy.dev#102 merges, canon tier takes over.
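
The updated parser is not shown on this page, so the following is only a sketch of what reading the Description column could look like. It assumes the 4-column layout is Field | Header | Source | Description (the order implied by the commit messages) and uses the hypothetical name `parseSelfReportHeadersTable4` to avoid suggesting it is the committed code. The regex and parseTableRow are taken from the diff above.

```typescript
// parseTableRow as in the markdown-utils.ts diff (export removed).
function parseTableRow(row: string): string[] {
  const parts = row.split("|");
  if (parts.length > 0 && parts[0].trim() === "") parts.shift();
  if (parts.length > 0 && parts[parts.length - 1].trim() === "") parts.pop();
  return parts.map((c) => c.trim());
}

// Hypothetical 4-column variant of parseSelfReportHeadersTable.
// Assumed layout: | Field | Header | Source | Description |
function parseSelfReportHeadersTable4(markdown: string): Record<string, string> | null {
  const section = markdown.match(
    /###\s+Self-Report Fields[^\n]*\n([\s\S]*?)(?=\n###|\n##|$)/,
  );
  if (!section) return null;

  const headers: Record<string, string> = {};
  for (const raw of section[1].split("\n")) {
    if (!raw.includes("|")) continue;
    const cols = parseTableRow(raw);
    // Requiring 4 columns means an old 3-column table yields no rows,
    // so the caller drops to the minimal tier instead of serving short labels.
    if (cols.length < 4) continue;
    const headerName = cols[1].replace(/`/g, "").trim();
    if (!headerName.startsWith("x-oddkit-")) continue; // header/separator rows
    headers[headerName] = cols[3]; // Description column
  }
  return Object.keys(headers).length > 0 ? headers : null;
}
```

Against a 3-column table this returns null, which matches the backward-compatibility behavior the commit message describes.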

Promote PR #106: telemetry_policy canary — headers from canon at runtime

…moke

Addresses three gaps found in live validation of canary PR #106:

1. Response envelope was missing server_time, assistant_text, debug. Every other oddkit tool returns {action, result, server_time, assistant_text, debug}; telemetry_policy returned only {action, result}. This breaks the time-discipline contract: project instructions require every oddkit response to carry server_time so models have a clock reading on every call.
2. canon_url parameter was silently ignored. The Zod schema was {}, so MCP stripped canon_url before the handler saw it, and the handler hardcoded the default baseline. The three-tier resolution contract in canon/constraints/core-governance-baseline assumes every canon-driven tool accepts canon_url for overrides; this is load-bearing for TruthKit / custom-canon consumers.
3. No live-smoke test for the envelope shape. Parser tests in governance-parser.test.mjs exercised parser logic only. The canary shipped with partial contract conformance because no test invoked the MCP tool end-to-end and asserted the envelope shape.

Changes:
- Add canon_url to the tool's Zod schema; thread through to fetcher.getFile(path, canon_url).
- Expand response envelope to match convention: server_time, assistant_text (human-readable summary naming the tier), debug with duration_ms and canon_url echo.
- New workers/test/canon-tool-envelope.smoke.mjs: live smoke script that curls the MCP endpoint and verifies envelope shape for oddkit_time (convention baseline), telemetry_policy default (canon tier), and telemetry_policy with canon_url override (minimal fallback).

Verified:
- npm run typecheck: clean
- Smoke script structure matches PR #100's governance-parser test style and exits non-zero on any envelope violation.

Lesson for the sweep: every canon-driven refactor must verify both the new governance_source signal AND full envelope conformance. The canary's partial completion was caught by live validation but should have been caught by pre-merge smoke.

Follow-up to update the refactor template in docs/oddkit/audit/... separately.
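
The envelope check at the heart of that smoke test can be sketched as a pure function. This is not the contents of canon-tool-envelope.smoke.mjs, which is not shown here and which calls the live MCP endpoint; `missingEnvelopeKeys` and `REQUIRED_ENVELOPE_KEYS` are hypothetical names, and the key list is taken from the envelope convention quoted in the commit message.

```typescript
// Envelope convention quoted above: {action, result, server_time, assistant_text, debug}.
// Names in this sketch are illustrative assumptions.
const REQUIRED_ENVELOPE_KEYS = [
  "action",
  "result",
  "server_time",
  "assistant_text",
  "debug",
] as const;

// Returns the required keys that are absent; an empty array means conformant.
function missingEnvelopeKeys(envelope: unknown): string[] {
  if (typeof envelope !== "object" || envelope === null) {
    return [...REQUIRED_ENVELOPE_KEYS];
  }
  const record = envelope as Record<string, unknown>;
  return REQUIRED_ENVELOPE_KEYS.filter((key) => !(key in record));
}

// The shape the canary originally shipped with, per the commit message.
console.log(missingEnvelopeKeys({ action: "telemetry_policy", result: {} }));
// → ["server_time", "assistant_text", "debug"]
```

A smoke script built on a check like this would exit non-zero whenever the returned array is non-empty, which is the failure mode the parser-only tests could not catch.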

Live validation of telemetry_policy canary (klappy/oddkit#106) against prod surfaced three gaps the original contract didn't name explicitly enough:

1. Response envelope shape is part of the contract. A tool that returns {action, result} but omits server_time/assistant_text/debug breaks the time-discipline system even if governance_source is present. Added as Runtime Invariant #3.
2. canon_url parameter must be in the Zod schema, not just documented as a concept. MCP silently strips unknown parameters. The canary shipped with schema={} and canon_url was unreachable. Added as Runtime Invariant #4.
3. Live-smoke against the MCP endpoint is a ship-blocker, not a nice-to-have. Internal parser tests passed while the tool shipped with broken envelope and silent param stripping. Added as Runtime Invariant #7, template referenced.

Refactor Implications section expanded to a 7-point checklist and acknowledges the canary's partial completion + follow-up PR as the first documented test of the contract.

Follow-up PR that closes the canary gaps: klappy/oddkit#108.

cursor Bot reviewed Apr 18, 2026

Comment thread workers/src/index.ts Outdated

refactor(workers): extract parseTableRow to shared markdown-utils

7b2d68b

klappy merged commit aae64f9 into main Apr 18, 2026
5 checks passed

klappy mentioned this pull request Apr 18, 2026

Promote PR #106: telemetry_policy canary — headers from canon at runtime #107

Merged

klappy added a commit that referenced this pull request Apr 18, 2026

Promote PR #106: telemetry_policy canary to prod

204ad1e

Promote PR #106: telemetry_policy canary — headers from canon at runtime

klappy mentioned this pull request Apr 19, 2026

fix(telemetry_policy): canary completeness + knowledge_base rename + live smoke #108

Merged

4 tasks

klappy mentioned this pull request Apr 20, 2026

feat(challenge): governance_source envelope + peer governance_uris (0.19.0) #116

Merged

klappy merged 3 commits into main from feat/canary-telemetry-policy-canon-headers


2 participants
