feat(telemetry_policy): canary refactor — headers from canon at runtime#106
Merged
Replaces the hardcoded self_report_headers dictionary with a runtime parse of canon/constraints/telemetry-governance.md #### Self-Report Fields table. Response envelope now declares governance_source: 'canon' when the fetch succeeds and the table parses, 'minimal' when it falls back to the shipped baseline.

This is the canary refactor for the governance anti-pattern sweep (docs/oddkit/audit/governance-anti-pattern-sweep-2026-04-17). It conforms to the three-tier resolution contract drafted in klappy/klappy.dev#101 (canon/constraints/core-governance-baseline), exercising tiers 1 (live canon) and 3 (minimal baseline in code). Tier 2 (bundled baseline directory with manifest) and the build-time schema check arrive in follow-up work once the contract graduates from status:draft to status:active.

Implementation:
- New helper parseSelfReportHeadersTable in index.ts parses the '### Self-Report Fields' table section from the canon doc.
- Parser is permissive (whitespace + backticks) and fails closed to null so the caller falls back to the minimal baseline rather than hiding the degradation.
- Minimal baseline remains the 8 stable headers; canon controls the descriptions once live.

Verified:
- npm run typecheck: clean
- Parser unit-tested against live canon content: 8/8 headers parsed
- Parser degradation paths (no section, empty table) return null

Refactor discipline this commit follows (from PR #100 post-mortem):
- Single feature PR, single site touched
- Public contract (MCP tool response) changes are additive (governance_source field added; self_report_headers keys unchanged)
- Preview smoke against live prod will verify canon-tier response before promotion
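The tier 1 / tier 3 fall-through described in this commit message can be sketched as a small pure function. This is an illustration only, not the handler from the PR: `resolveSelfReportHeaders`, `HeaderMap`, `LoadCanon` and `ParseTable` are hypothetical names, and the loader is synchronous for brevity even though the real fetch is asynchronous.

```typescript
// Hypothetical names throughout; none of these identifiers appear in the PR.
type GovernanceSource = "canon" | "minimal";
type HeaderMap = Record<string, string>;
type LoadCanon = () => string | null; // synchronous stand-in for the async fetch
type ParseTable = (markdown: string) => HeaderMap | null;

interface Resolution {
  governanceSource: GovernanceSource;
  headers: HeaderMap;
}

function resolveSelfReportHeaders(
  loadCanon: LoadCanon,
  parseTable: ParseTable,
  minimalBaseline: HeaderMap,
): Resolution {
  try {
    // Tier 1: live canon. Both the load and the parse must succeed.
    const markdown = loadCanon();
    const parsed = markdown ? parseTable(markdown) : null;
    if (parsed && Object.keys(parsed).length > 0) {
      return { governanceSource: "canon", headers: parsed };
    }
  } catch {
    // A failed load is handled the same way as a failed parse.
  }
  // Tier 3: minimal baseline shipped in code. The source label keeps the
  // degradation visible to the caller instead of hiding it.
  return { governanceSource: "minimal", headers: minimalBaseline };
}
```

Because the parser returns null rather than a partial result, the caller has exactly one degradation path to handle, and the response can always say which tier produced the headers.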
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs | oddkit | db1936d | Commit Preview URL Branch Preview URL | Apr 18 2026, 10:21 PM |
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Duplicated `parseTableRow` risks silent drift between copies
  - Extracted `parseTableRow` into a new shared `workers/src/markdown-utils.ts` module and updated both `index.ts` and `orchestrate.ts` to import from it, eliminating the duplicated implementations; typecheck passes.
Preview (7b2d68bcfb)
diff --git a/workers/src/index.ts b/workers/src/index.ts
--- a/workers/src/index.ts
+++ b/workers/src/index.ts
@@ -22,6 +22,7 @@
import { RequestTracer } from "./tracing";
import { parseConsumerLabel } from "./telemetry";
import { renderNotFoundPage } from "./not-found-ui";
+import { parseTableRow } from "./markdown-utils";
import pkg from "../package.json";
export type { Env };
@@ -29,6 +30,40 @@
const BUILD_VERSION = pkg.version;
// ──────────────────────────────────────────────────────────────────────────────
+// Canon-table parsing helper.
+//
+// parseSelfReportHeadersTable extracts the self-report header contract from
+// canon/constraints/telemetry-governance.md. The table format is governed by
+// the canon doc itself; this parser is deliberately permissive (whitespace,
+// backticks around header name) and fails closed to null so the caller can
+// fall back to the minimal baseline without hiding the degradation.
+// ──────────────────────────────────────────────────────────────────────────────
+
+function parseSelfReportHeadersTable(markdown: string): Record<string, string> | null {
+ // Target section: "### Self-Report Fields" — grab the table that follows.
+ // Stop at the next `###` or `##` heading, whichever comes first.
+ const section = markdown.match(
+ /###\s+Self-Report Fields[^\n]*\n([\s\S]*?)(?=\n###|\n##|$)/,
+ );
+ if (!section) return null;
+
+ const headers: Record<string, string> = {};
+ for (const raw of section[1].split("\n")) {
+ if (!raw.includes("|")) continue;
+ const cols = parseTableRow(raw);
+ // Expected layout: | Field | Header | Source |
+ // Skip header row, separator row, and any malformed row.
+ if (cols.length < 2) continue;
+ const fieldDescription = cols[0];
+ const headerName = cols[1].replace(/`/g, "").trim();
+ if (!headerName.startsWith("x-oddkit-")) continue; // skip header/separator
+ headers[headerName] = fieldDescription;
+ }
+
+ return Object.keys(headers).length > 0 ? headers : null;
+}
+
+// ──────────────────────────────────────────────────────────────────────────────
// Consumer identification nudge
//
// DO NOT add session caching, sticky identification, or per-session memory.
@@ -451,7 +486,7 @@
server.tool(
"telemetry_policy",
- "Return oddkit telemetry and sharing policy guidance. What is tracked, what is excluded, and why. Fetched from canonical governance document at runtime.",
+ "Return oddkit telemetry and sharing policy guidance. What is tracked, what is excluded, and why. Fetched from canonical governance document at runtime. Response envelope declares governance_source (canon|baseline|minimal) per canon/constraints/core-governance-baseline.",
{},
{
readOnlyHint: true,
@@ -460,17 +495,52 @@
openWorldHint: true,
},
async () => {
- // Fetch the governance doc from canon
+ // Governance resolution per canon/constraints/core-governance-baseline:
+ // 1. Live canon fetch (preferred) → governance_source: "canon"
+ // 2. Minimal baseline (shipped in code) → governance_source: "minimal"
+ //
+ // This canary refactor implements tiers 1 and 3 only. The bundled
+ // baseline tier (2) and the build-time schema check arrive in follow-up
+ // work; the manifest + baseline directory are not yet in place.
const fetcher = new ZipBaselineFetcher(env);
- let policyContent = "Governance document not found. See https://github.com/klappy/klappy.dev/blob/main/canon/constraints/telemetry-governance.md";
+ let policyContent: string | null = null;
+ let selfReportHeaders: Record<string, string> | null = null;
+ let governanceSource: "canon" | "baseline" | "minimal" = "minimal";
try {
const content = await fetcher.getFile("canon/constraints/telemetry-governance.md");
- if (content) policyContent = content;
+ if (content) {
+ policyContent = content;
+ const parsed = parseSelfReportHeadersTable(content);
+ if (parsed && Object.keys(parsed).length > 0) {
+ selfReportHeaders = parsed;
+ governanceSource = "canon";
+ }
+ }
} catch {
- // Fall through to default message
+ // Fall through to minimal tier below
}
+ if (governanceSource === "minimal") {
+ // Minimal baseline — the tool remains useful when canon is unreachable
+ // or the table cannot be parsed. These eight headers are the stable
+ // self-report contract; if canon adds a 9th, the "canon" tier delivers
+ // it and this list stays as the floor.
+ selfReportHeaders = {
+ "x-oddkit-client": "Your client name (highest priority identifier)",
+ "x-oddkit-client-version": "Your client version",
+ "x-oddkit-agent-name": "The AI agent name",
+ "x-oddkit-agent-version": "The AI agent version",
+ "x-oddkit-surface": "Where this is running (e.g. claude.ai, vscode)",
+ "x-oddkit-contact-url": "URL for your project or org",
+ "x-oddkit-policy-url": "Your privacy/telemetry policy URL",
+ "x-oddkit-capabilities": "Comma-separated capability list",
+ };
+ if (!policyContent) {
+ policyContent = "Governance document not reachable. See https://github.com/klappy/klappy.dev/blob/main/canon/constraints/telemetry-governance.md";
+ }
+ }
+
return {
content: [{
type: "text" as const,
@@ -479,16 +549,8 @@
result: {
policy: policyContent,
governance_uri: "klappy://canon/constraints/telemetry-governance",
- self_report_headers: {
- "x-oddkit-client": "Your client name (highest priority identifier)",
- "x-oddkit-client-version": "Your client version",
- "x-oddkit-agent-name": "The AI agent name",
- "x-oddkit-agent-version": "The AI agent version",
- "x-oddkit-surface": "Where this is running (e.g. claude.ai, vscode)",
- "x-oddkit-contact-url": "URL for your project or org",
- "x-oddkit-policy-url": "Your privacy/telemetry policy URL",
- "x-oddkit-capabilities": "Comma-separated capability list",
- },
+ governance_source: governanceSource,
+ self_report_headers: selfReportHeaders,
generated_at: new Date().toISOString(),
},
}, null, 2),
diff --git a/workers/src/markdown-utils.ts b/workers/src/markdown-utils.ts
new file mode 100644
--- /dev/null
+++ b/workers/src/markdown-utils.ts
@@ -1,0 +1,24 @@
+/**
+ * Shared markdown parsing helpers.
+ *
+ * Keep this module dependency-free so it can be imported from any code path
+ * (orchestrate, index, future canon readers) without pulling in unrelated
+ * state. Every helper here must be pure and stateless.
+ */
+
+/**
+ * Parse a single markdown table row into trimmed cell values, preserving
+ * legitimately-empty middle cells. Only the leading and trailing empty strings
+ * produced by splitting a `| a | b |`-style row are stripped — a prior
+ * `.filter(c => c.length > 0)` approach also dropped empty interior cells,
+ * which silently collapsed the column count and caused `cols.length >= N`
+ * guards to misfire (e.g. a voice-dump row with an empty tiers cell).
+ */
+export function parseTableRow(row: string): string[] {
+ const parts = row.split("|");
+ // Strip the leading empty produced by a leading `|`, if present
+ if (parts.length > 0 && parts[0].trim() === "") parts.shift();
+ // Strip the trailing empty produced by a trailing `|`, if present
+ if (parts.length > 0 && parts[parts.length - 1].trim() === "") parts.pop();
+ return parts.map((c) => c.trim());
+}
diff --git a/workers/src/orchestrate.ts b/workers/src/orchestrate.ts
--- a/workers/src/orchestrate.ts
+++ b/workers/src/orchestrate.ts
@@ -18,6 +18,7 @@
type SectionResult,
} from "./zip-baseline-fetcher";
import { buildBM25Index, searchBM25, type BM25Index } from "./bm25";
+import { parseTableRow } from "./markdown-utils";
import type { RequestTracer } from "./tracing";
import pkg from "../package.json";
@@ -155,27 +156,6 @@
}
// ──────────────────────────────────────────────────────────────────────────────
-// Markdown table helpers
-// ──────────────────────────────────────────────────────────────────────────────
-
-/**
- * Parse a single markdown table row into trimmed cell values, preserving
- * legitimately-empty middle cells. Only the leading and trailing empty strings
- * produced by splitting a `| a | b |`-style row are stripped — a prior
- * `.filter(c => c.length > 0)` approach also dropped empty interior cells,
- * which silently collapsed the column count and caused `cols.length >= N`
- * guards to misfire (e.g. a voice-dump row with an empty tiers cell).
- */
-function parseTableRow(row: string): string[] {
- const parts = row.split("|");
- // Strip the leading empty produced by a leading `|`, if present
- if (parts.length > 0 && parts[0].trim() === "") parts.shift();
- // Strip the trailing empty produced by a trailing `|`, if present
- if (parts.length > 0 && parts[parts.length - 1].trim() === "") parts.pop();
- return parts.map((c) => c.trim());
-}
-
-// ──────────────────────────────────────────────────────────────────────────────
// BM25 Index Cache (per-request, lazy)
// ──────────────────────────────────────────────────────────────────────────────
Reviewed by Cursor Bugbot for commit af861a7.
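To make the doc comment on `parseTableRow` concrete, here is a small comparison sketch. `parseTableRow` is copied from the diff so the block is self-contained; `parseTableRowFiltering` is a reconstruction of the earlier `.filter(c => c.length > 0)` approach based only on that doc comment, and the sample row is invented for illustration.

```typescript
// Copied from workers/src/markdown-utils.ts in the diff above.
function parseTableRow(row: string): string[] {
  const parts = row.split("|");
  // Strip the leading empty produced by a leading `|`, if present
  if (parts.length > 0 && parts[0].trim() === "") parts.shift();
  // Strip the trailing empty produced by a trailing `|`, if present
  if (parts.length > 0 && parts[parts.length - 1].trim() === "") parts.pop();
  return parts.map((c) => c.trim());
}

// Assumed shape of the earlier approach: drop every empty cell after trimming.
function parseTableRowFiltering(row: string): string[] {
  return row
    .split("|")
    .map((c) => c.trim())
    .filter((c) => c.length > 0);
}

// Invented sample row with an empty middle cell.
const row = "| Client name |  | Request header |";

const kept = parseTableRow(row);
// kept is ["Client name", "", "Request header"]: three columns, position preserved

const collapsed = parseTableRowFiltering(row);
// collapsed is ["Client name", "Request header"]: two columns, so a
// `cols.length >= 3` guard would reject a row that is actually well-formed
```

Keeping the empty interior cell means column indexes stay aligned with the table header, which is what index-based reads such as `cols[1]` rely on.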
klappy added a commit to klappy/klappy.dev that referenced this pull request on Apr 18, 2026
The canary refactor (klappy/oddkit#106) reads this table at runtime to populate telemetry_policy's self_report_headers response field. The prior 3-column schema (Field, Header, Source) gave callers only short labels ('Client name'); the hardcoded dictionary in the worker had richer per-field guidance ('Your client name (highest priority identifier)') that was lost when canon became the source of truth. Extending the table to 4 columns lets canon carry the authoritative per-header description. This is a DRY consolidation: the guidance lives in one place (here), the worker parses it at runtime, and updates flow through canon rather than code. Examples added per field (client names, surface identifiers, capability list format) so the table doubles as implementer docs for consumers wiring up their own telemetry headers. Related: klappy/oddkit#106 (canary refactor), #101 (core-governance-baseline contract draft).
Addresses execution-mode challenge gaps on PR #106:
1. Information regression fixed: parser now reads canon's fourth column (Description) instead of the first column (Field label). Canon was extended with richer per-header descriptions in klappy/klappy.dev#102; this commit updates the parser to consume that new column.
2. Tests committed: Test 8 added to workers/test/governance-parser.test.mjs covering 8/8 header extraction, non-trivial description lengths, and degradation paths (no section, empty table). All 105 tests pass against the unmerged klappy.dev branch via KLAPPYDEV_RAW override.

Still outstanding (follow-up work, not blocking the canary):
- parseTableRow duplicated across workers/src/index.ts and workers/src/orchestrate.ts. Accepted duplication for now, flagged in both sites; export-and-share refactor lands when the sweep surfaces more duplication candidates.
- Preview smoke against Cloudflare preview with the extended canon loaded but no worker redeploy: run manually after this PR deploys.

Companion PR: klappy/klappy.dev#102 (canon extension). This worker change is backward-compatible with the old 3-column table: the parser requires 4 cols, so against the old canon it falls through to the minimal baseline tier. Once klappy.dev#102 merges, canon tier takes over.
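The backward-compatibility claim (4 columns required, so a 3-column table falls through to the minimal tier) can be illustrated with a sketch. This is not the code from the commit, which is not shown on this page: `parseSelfReportHeadersFourColumn` is a hypothetical name, the column order (Field, Header, Source, Description) is inferred from the commit messages, and both sample tables are invented.

```typescript
// Copied from the diff earlier in this PR so the sketch is self-contained.
function parseTableRow(row: string): string[] {
  const parts = row.split("|");
  if (parts.length > 0 && parts[0].trim() === "") parts.shift();
  if (parts.length > 0 && parts[parts.length - 1].trim() === "") parts.pop();
  return parts.map((c) => c.trim());
}

// Hypothetical 4-column variant: assumes | Field | Header | Source | Description |.
function parseSelfReportHeadersFourColumn(markdown: string): Record<string, string> | null {
  const section = markdown.match(
    /###\s+Self-Report Fields[^\n]*\n([\s\S]*?)(?=\n###|\n##|$)/,
  );
  if (!section) return null;

  const headers: Record<string, string> = {};
  for (const raw of section[1].split("\n")) {
    if (!raw.includes("|")) continue;
    const cols = parseTableRow(raw);
    // Rows from the old 3-column schema are skipped here, so an old table
    // yields no headers and the function returns null.
    if (cols.length < 4) continue;
    const headerName = cols[1].replace(/`/g, "").trim();
    if (!headerName.startsWith("x-oddkit-")) continue; // header and separator rows
    headers[headerName] = cols[3]; // Description column
  }
  return Object.keys(headers).length > 0 ? headers : null;
}

// Invented sample documents for illustration.
const fourColumnCanon = [
  "### Self-Report Fields",
  "",
  "| Field | Header | Source | Description |",
  "|---|---|---|---|",
  "| Client name | `x-oddkit-client` | Request header | Your client name (highest priority identifier) |",
  "",
  "## Next section",
].join("\n");

const threeColumnCanon = [
  "### Self-Report Fields",
  "",
  "| Field | Header | Source |",
  "|---|---|---|",
  "| Client name | `x-oddkit-client` | Request header |",
].join("\n");

const fromNewCanon = parseSelfReportHeadersFourColumn(fourColumnCanon);
const fromOldCanon = parseSelfReportHeadersFourColumn(threeColumnCanon);
// fromNewCanon maps "x-oddkit-client" to its Description cell; fromOldCanon is null
```

Returning null for the old schema is what lets the worker ship before the canon change merges: the caller sees a failed parse and reports the minimal tier.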
klappy added a commit to klappy/klappy.dev that referenced this pull request on Apr 18, 2026
…102) The canary refactor (klappy/oddkit#106) reads this table at runtime to populate telemetry_policy's self_report_headers response field. The prior 3-column schema (Field, Header, Source) gave callers only short labels ('Client name'); the hardcoded dictionary in the worker had richer per-field guidance ('Your client name (highest priority identifier)') that was lost when canon became the source of truth. Extending the table to 4 columns lets canon carry the authoritative per-header description. This is a DRY consolidation: the guidance lives in one place (here), the worker parses it at runtime, and updates flow through canon rather than code. Examples added per field (client names, surface identifiers, capability list format) so the table doubles as implementer docs for consumers wiring up their own telemetry headers. Related: klappy/oddkit#106 (canary refactor), #101 (core-governance-baseline contract draft).
klappy added a commit that referenced this pull request on Apr 18, 2026
Promote PR #106: telemetry_policy canary — headers from canon at runtime
klappy added a commit that referenced this pull request on Apr 19, 2026
…moke
Addresses three gaps found in live validation of canary PR #106:
1. Response envelope was missing server_time, assistant_text, debug. Every other oddkit tool returns {action, result, server_time, assistant_text, debug}; telemetry_policy returned only {action, result}. This breaks the time-discipline contract: project instructions require every oddkit response to carry server_time so models have a clock reading on every call.
2. canon_url parameter was silently ignored. The Zod schema was {}, so MCP stripped canon_url before the handler saw it, and the handler hardcoded the default baseline. The three-tier resolution contract in canon/constraints/core-governance-baseline assumes every canon-driven tool accepts canon_url for overrides; this is load-bearing for TruthKit / custom-canon consumers.
3. No live-smoke test for the envelope shape. Parser tests in governance-parser.test.mjs exercised parser logic only. The canary shipped with partial contract conformance because no test invoked the MCP tool end-to-end and asserted the envelope shape.

Changes:
- Add canon_url to the tool's Zod schema; thread through to fetcher.getFile(path, canon_url).
- Expand response envelope to match convention: server_time, assistant_text (human-readable summary naming the tier), debug with duration_ms and canon_url echo.
- New workers/test/canon-tool-envelope.smoke.mjs: live smoke script that curls the MCP endpoint and verifies envelope shape for oddkit_time (convention baseline), telemetry_policy default (canon tier), and telemetry_policy with canon_url override (minimal fallback).

Verified:
- npm run typecheck: clean
- Smoke script structure matches PR #100's governance-parser test style and exits non-zero on any envelope violation.

Lesson for the sweep: every canon-driven refactor must verify both the new governance_source signal AND full envelope conformance. The canary's partial completion was caught by live validation but should have been caught by pre-merge smoke. Follow-up to update the refactor template in docs/oddkit/audit/... separately.
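The envelope-shape assertion this commit describes could look roughly like the sketch below. It is not the contents of canon-tool-envelope.smoke.mjs: `missingEnvelopeKeys` is a hypothetical helper, the key list is taken from the convention quoted in the commit message, and the sample envelopes use placeholder values.

```typescript
// Convention keys as listed in the commit message above.
const REQUIRED_ENVELOPE_KEYS = [
  "action",
  "result",
  "server_time",
  "assistant_text",
  "debug",
] as const;

// Hypothetical helper: returns the convention keys absent from a parsed envelope.
function missingEnvelopeKeys(envelope: unknown): string[] {
  if (typeof envelope !== "object" || envelope === null) {
    return [...REQUIRED_ENVELOPE_KEYS];
  }
  return REQUIRED_ENVELOPE_KEYS.filter((key) => !(key in envelope));
}

// Placeholder envelopes: the first has the {action, result} shape the canary
// originally returned, the second has the full convention shape.
const canaryEnvelope = { action: "placeholder", result: {} };
const fullEnvelope = {
  action: "placeholder",
  result: {},
  server_time: "2026-04-19T00:00:00.000Z",
  assistant_text: "placeholder summary",
  debug: { duration_ms: 0 },
};

const canaryMissing = missingEnvelopeKeys(canaryEnvelope);
// canaryMissing is ["server_time", "assistant_text", "debug"]

const fullMissing = missingEnvelopeKeys(fullEnvelope);
// fullMissing is []
```

A smoke script built on a check like this would exit non-zero whenever the returned list is non-empty, which is the behavior the Verified section describes.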
4 tasks
klappy added a commit to klappy/klappy.dev that referenced this pull request on Apr 19, 2026
Live validation of telemetry_policy canary (klappy/oddkit#106) against prod surfaced three gaps the original contract didn't name explicitly enough:
1. Response envelope shape is part of the contract. A tool that returns {action, result} but omits server_time/assistant_text/debug breaks the time-discipline system even if governance_source is present. Added as Runtime Invariant #3.
2. canon_url parameter must be in the Zod schema, not just documented as a concept. MCP silently strips unknown parameters. The canary shipped with schema={} and canon_url was unreachable. Added as Runtime Invariant #4.
3. Live-smoke against the MCP endpoint is a ship-blocker, not a nice-to-have. Internal parser tests passed while the tool shipped with broken envelope and silent param stripping. Added as Runtime Invariant #7, template referenced.

Refactor Implications section expanded to a 7-point checklist and acknowledges the canary's partial completion + follow-up PR as the first documented test of the contract. Follow-up PR that closes the canary gaps: klappy/oddkit#108.
Canary refactor — the governance anti-pattern sweep begins here
Replaces the hardcoded `self_report_headers` dictionary in `telemetry_policy` with a runtime parse of the `### Self-Report Fields` table in `canon/constraints/telemetry-governance.md`. Response envelope now declares `governance_source: "canon" | "minimal"`.

Why this one first
This is the lowest-stakes instance of the Vodka anti-pattern cataloged in `docs/oddkit/audit/governance-anti-pattern-sweep-2026-04-17.md` (PR #105). The goal is to prove the refactor template (single feature PR, single promotion PR, canon-change-no-redeploy preview smoke) before applying it to higher-stakes tools (validate, orient, gate, encode).

Scope is deliberately tight: one tool, one helper, ~80 lines of diff. If this takes more than 2 PRs to ship, the sweep pauses and we diagnose the template.

Contract conformance
Implements the three-tier resolution stack from `klappy/klappy.dev#101` (`canon/constraints/core-governance-baseline`):
- Tier 1 (live canon): `getFile("canon/constraints/telemetry-governance.md")` → `parseSelfReportHeadersTable` → `governance_source: "canon"`
- Tier 2 (bundled baseline): deferred until the contract graduates from `status: draft` to `status: active`
- Tier 3 (minimal baseline): `governance_source: "minimal"` signals degradation to callers.

The companion canon doc ships tier:1 status:draft and uses this refactor as its first validation case.

Verification
- `npm run typecheck` clean

Public contract: additive, not breaking
- `self_report_headers` keys unchanged (still the same 8 `x-oddkit-*` headers)
- `self_report_headers` values change slightly: canon's terser descriptions now win when the canon tier succeeds, which is the intended behavior per `canon/principles/dry-canon-says-it-once`
- `governance_source` added to the result object

Refactor discipline (from PR #100 post-mortem)
- Single feature PR, single site touched (`workers/src/index.ts`)

Related
- `klappy/oddkit#105`
- `klappy/klappy.dev#101`
- `klappy/oddkit#100`, `#101`, `#102`, `#103`, `#104`

Note
Medium Risk
Adds runtime parsing of a governance markdown table to drive `telemetry_policy` output and introduces a new `governance_source` field; failures fall back to a minimal hardcoded baseline, but format drift or parsing bugs could change returned header descriptions.

Overview
`telemetry_policy` now derives `self_report_headers` from canon at runtime by parsing the `### Self-Report Fields` markdown table in `canon/constraints/telemetry-governance.md`, and adds `governance_source` (`canon|minimal`) to explicitly signal whether canon parsing succeeded or the minimal fallback was used.

Introduces a shared `parseTableRow` helper in `markdown-utils.ts` (reused by `orchestrate.ts`) and extends the governance parser test to validate the self-report headers extraction and degradation behavior.

Reviewed by Cursor Bugbot for commit db1936d. Bugbot is set up for automated code reviews on this repo.
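For callers, the practical effect of `governance_source` is that degradation can be detected from the response itself. A minimal consumer-side sketch follows, assuming the tool's text content is the JSON document built in the diff; `readGovernance` and `TelemetryPolicyResult` are hypothetical names, and the sample payload is invented.

```typescript
// Field names follow the result object in the diff; the interface name is hypothetical.
interface TelemetryPolicyResult {
  policy: string;
  governance_uri: string;
  governance_source: "canon" | "baseline" | "minimal";
  self_report_headers: Record<string, string>;
  generated_at: string;
}

interface GovernanceSummary {
  source: TelemetryPolicyResult["governance_source"];
  degraded: boolean; // true whenever the live canon tier did not produce the headers
  headerCount: number;
}

// Hypothetical consumer helper: parses the tool's JSON text and reports the tier.
function readGovernance(text: string): GovernanceSummary {
  const envelope = JSON.parse(text) as { result: TelemetryPolicyResult };
  const { governance_source, self_report_headers } = envelope.result;
  return {
    source: governance_source,
    degraded: governance_source !== "canon",
    headerCount: Object.keys(self_report_headers).length,
  };
}

// Invented sample payload in the minimal tier.
const sampleText = JSON.stringify({
  result: {
    policy: "Governance document not reachable.",
    governance_uri: "klappy://canon/constraints/telemetry-governance",
    governance_source: "minimal",
    self_report_headers: { "x-oddkit-client": "Your client name (highest priority identifier)" },
    generated_at: "2026-04-18T22:21:00.000Z",
  },
});

const summary = readGovernance(sampleText);
// summary.degraded is true and summary.headerCount is 1 for this sample
```

Since the header keys are unchanged across tiers, a consumer that ignores `governance_source` keeps working; one that reads it can log or surface the fallback.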