
chore(release): 0.24.0 — payload-shape telemetry #135

Closed

klappy wants to merge 12 commits into main from chore/release-0.24.0

Conversation

klappy (Owner) commented Apr 23, 2026

Release: 0.24.0 (minor)

Carries the payload-shape telemetry feature from feat/telemetry-tokenization plus the version bump. Branch is based on feat/telemetry-tokenization HEAD so all 9 commits ride along — when this lands, the feature branch can be closed.

Bumps

| File | Before | After |
| --- | --- | --- |
| package.json | 0.23.1 | 0.24.0 |
| workers/package.json | 0.23.1 | 0.24.0 |
| package-lock.json | 0.23.0 ⚠ | 0.24.0 |
| workers/package-lock.json | 0.23.1 | 0.24.0 |

⚠ Root package-lock.json had drifted one release behind (0.23.0 while workers was at 0.23.1) — back-filled here. Both lockfiles still require manual sync per current tooling; the pre-commit hook only enforces sync between the two package.json files.

What's in this release

The full CHANGELOG entry is on the diff. Headline items:

  • Added: bytes_in, bytes_out, tokens_in, tokens_out telemetry doubles via gpt-tokenizer/encoding/cl100k_base. Module-level lazy singleton, ~432 KB gzipped, ~6× faster than @anthropic-ai/tokenizer per the in-tree bench. All measurement happens in ctx.waitUntil so user-facing latency is unchanged. (A sketch of the measurement follows this list.)
  • Changed: No Content-Type filter on the response body — MCP's Streamable HTTP transport returns text/event-stream, not application/json, and the original filter caused 100% of tool_call responses to record bytes_out=0.
  • Removed: tokenize_ms (formerly double7). Cloudflare Workers freezes both performance.now() and Date.now() between network I/O events as a timing-side-channel mitigation, making sub-request timing of pure-CPU tokenization structurally unmeasurable. The bench at workers/test/tokenize.test.mjs characterized the cost curve; future per-call cost is predictable from observed bytes_out / tokens_out against that curve.
  • Fixed: Root package-lock.json drift back-fill.
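
For concreteness, here is a minimal sketch of what the shape measurement amounts to (simplified; it assumes gpt-tokenizer's per-encoding entry point exports countTokens, and the shipped helper is measurePayloadShape in workers/src/tokenize.ts):

```ts
// Minimal sketch, not the shipped code: UTF-8 byte lengths plus
// cl100k_base token counts for both sides of a request.
import { countTokens } from "gpt-tokenizer/encoding/cl100k_base";

function payloadShape(requestText: string, responseText: string) {
  const utf8 = new TextEncoder();
  return {
    bytes_in: utf8.encode(requestText).length,   // wire bytes, not JS string length
    bytes_out: utf8.encode(responseText).length,
    tokens_in: countTokens(requestText),
    tokens_out: countTokens(responseText),
  };
}
```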

Full schema after this release:

```text
blobs:   [event_type, method, tool_name, consumer_label, consumer_source,
          knowledge_base_url, document_uri, worker_version, cache_tier]    // 9
doubles: [count, duration_ms, bytes_in, bytes_out, tokens_in, tokens_out]  // 6
indexes: [consumer_label]                                                  // 1
```
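
Assuming the standard Analytics Engine writeDataPoint call and the ODDKIT_TELEMETRY binding name the integration tests mock, a data point under this schema would look roughly like this (blob and double values illustrative, loosely echoing the smoke run):

```ts
// Illustrative only: one data point in the 0.24.0 shape.
interface Env {
  ODDKIT_TELEMETRY: AnalyticsEngineDataset; // binding name from the tests
}

function recordExample(env: Env): void {
  env.ODDKIT_TELEMETRY.writeDataPoint({
    blobs: ["tool_call", "POST", "oddkit_time", "agent", "header",
            "", "", "0.24.0", "kv"],       // 9 blobs, order per the schema above
    doubles: [1, 42, 512, 178, 120, 71],   // count, duration_ms, bytes_in, bytes_out, tokens_in, tokens_out
    indexes: ["agent"],                    // consumer_label
  });
}
```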

Validation

  • 7/7 unit tests pass (workers/test/tokenize.test.mjs)
  • 6/6 integration tests pass (workers/test/telemetry-integration.test.mjs)
  • Typecheck clean (reports as oddkit-mcp-worker@0.24.0)
  • Live preview smoke PASS — fifth Managed Agent run (sesn_011CaMNujMg9pymcz18JFPp8) confirmed all four shape fields populate with realistic varied values across distinct tools (oddkit_catalog: 21,437 bytes_out / 5,856 tokens_out; oddkit_time: 178 bytes_out / 71 tokens_out). MAX(double7) = 0 confirms tokenize_ms cleanly absent.

Workers-runtime forensics

Four distinct Workers ≠ Node behavioral diffs surfaced and resolved on this branch, each caught by live smoke (none by unit tests). Listed in the CHANGELOG Refs trailer with the corresponding agent session IDs.

Companion PR (canon)

klappy/klappy.dev#134 — telemetry-governance schema update + two new constraints (measure-before-you-object, performed-prudence-anti-pattern). Suggested merge order: that one first (governance lands, telemetry_policy reflects new schema immediately), then this one.

Sequencing options

Either works.


Note

Medium Risk
Adds new telemetry measurement and schema fields (bytes/tokens) to the production Workers MCP handler and introduces a new tokenizer dependency, which could affect runtime performance/memory and Analytics Engine dashboards despite being deferred via waitUntil. Risk is mitigated by defensive try/catch, synchronous response cloning, and new unit/integration tests covering the write path.

Overview
Adds payload-shape telemetry to the Workers MCP server: each request now records bytes_in, bytes_out, tokens_in, and tokens_out as new doubles (double3–double6) using a lazily loaded gpt-tokenizer cl100k encoder (workers/src/tokenize.ts), executed in ctx.waitUntil to avoid impacting response latency.

Updates the telemetry write path to accept the already-read request body string, attach the new payload metrics to every written data point (including batch JSON-RPC), and drops the previously attempted tokenize_ms field; response measurement no longer gates on Content-Type so SSE/tool-call responses are included.

Bumps versions to 0.24.0 (root + workers) and syncs both lockfiles; adds unit + integration tests validating tokenizer behavior and end-to-end telemetry schema/writes.


Claude (drafting for klappy) and others added 12 commits April 23, 2026 19:01
…-tokenizer

Adds payload-shape instrumentation to MCP telemetry. New doubles 3-7
capture wire size and cl100k_base token counts for every request and
response, plus the wall-clock cost of tokenization itself.

Implementation:

- New module workers/src/tokenize.ts wraps gpt-tokenizer/encoding/cl100k_base
  with a lazy-loaded singleton encoder and a safe-failure surface
  (countTokensSafe, measurePayloadShape). Module-level promise caches the
  encoder across requests within a worker isolate; cold path pays parse
  once, all subsequent calls are warm. (A sketch follows this list.)

- Refactors workers/src/telemetry.ts recordTelemetry signature to accept
  a pre-read body string + optional PayloadShape rather than reading the
  request body itself. Schema doc comment expanded to describe doubles
  3-7. Synchronous now (no longer returns a Promise) since the caller's
  measurement work happens in waitUntil.

- Updates workers/src/index.ts call site: clones the response (when
  Content-Type is application/json), reads request and response bodies in
  the waitUntil background task, calls measurePayloadShape, then
  recordTelemetry. Zero user-facing latency added — measurement happens
  after the response is sent. SSE responses skip body measurement.
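
A sketch of the lazy-singleton surface described above (simplified from the real tokenize.ts; the cached dynamic import and the null-on-failure contract are the load-bearing parts):

```ts
// Simplified sketch of tokenize.ts. The import promise is cached at
// module level, so each worker isolate parses the encoder once and
// every later call within that isolate hits the warm path.
let encoderPromise:
  | Promise<typeof import("gpt-tokenizer/encoding/cl100k_base")>
  | null = null;

function getEncoder() {
  encoderPromise ??= import("gpt-tokenizer/encoding/cl100k_base");
  return encoderPromise;
}

export async function countTokensSafe(text: string): Promise<number | null> {
  if (!text) return 0; // trivial short-circuit: the encoder never runs
  try {
    const { countTokens } = await getEncoder();
    return countTokens(text);
  } catch {
    return null; // load or encode failure; caller records 0, request unaffected
  }
}
```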

Tokenizer choice:

- gpt-tokenizer/encoding/cl100k_base over @anthropic-ai/tokenizer.
  Empirical bench (Node v22, same V8 as Workers): cl100k median 0.05-1.3ms
  across 200B-50KB payloads vs 0.30-7.4ms for Anthropic WASM. p95 dramatically
  better (no WASM memory-grow spikes). A bench sketch follows this list.
- Token count diverges ~3-4% from Claude tokenizer on English prose;
  acceptable noise floor for shape analysis (we are not billing).
- Bundle delta measured empirically via esbuild: 432KB gzipped
  (993KB minified). Comfortably within paid-tier Workers limits.
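
The bench itself is in-tree; this is a hedged sketch of its shape, not the file verbatim (payload sizes and run counts illustrative):

```ts
// Hedged sketch of the comparison bench, run under Node.
import { performance } from "node:perf_hooks";
import { countTokens } from "gpt-tokenizer/encoding/cl100k_base";

function medianMs(payload: string, runs = 50): number {
  const times: number[] = [];
  for (let i = 0; i < runs; i++) {
    const t0 = performance.now();
    countTokens(payload);
    times.push(performance.now() - t0);
  }
  return times.sort((a, b) => a - b)[Math.floor(runs / 2)];
}

for (const size of [200, 2_000, 8_000, 50_000]) {
  console.log(`${size}B:`, medianMs("x".repeat(size)).toFixed(2), "ms");
}
```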

Failure handling:

- Any tokenizer load or encode failure → countTokensSafe returns null,
  treated as 0 in telemetry. tokenize_ms = 0 alongside non-zero bytes
  signals a measurement skip in the data.
- Telemetry must never break MCP requests — all measurement code wrapped
  in try/catch within the waitUntil block (call-site sketch below).
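
The call-site shape that describes, as a sketch (withTelemetry is a hypothetical wrapper; measurePayloadShape and recordTelemetry are the real names; the Content-Type filter is elided here, and a later commit removes it anyway):

```ts
// Sketch of the call-site invariant: clone synchronously before the
// response is returned, defer every measurement to waitUntil, and
// never let a telemetry error escape into the request path.
function withTelemetry(
  ctx: ExecutionContext,
  env: Env,
  requestText: string,
  response: Response,
): Response {
  const responseClone = response.clone();
  ctx.waitUntil(
    (async () => {
      try {
        const responseText = await responseClone.text();
        const shape = await measurePayloadShape(requestText, responseText);
        recordTelemetry(env, requestText, shape);
      } catch {
        // Telemetry must never break MCP requests.
      }
    })(),
  );
  return response;
}
```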

Tests:

- New workers/test/tokenize.test.mjs (8 cases, all pass): empty input,
  positive integer output, scaling with length, full PayloadShape contract,
  UTF-8 byte length correctness, JSON-RPC payload tokenization, tokenize_ms
  finiteness, empty-response (SSE) skip path.
- Compiles tokenize.ts via tsc into a temp dir, then dynamic-imports;
  exercises the same TypeScript surface that ships in the worker bundle
  (harness sketch below).
- npm run typecheck clean.
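
The compile-then-import harness, sketched (tsc flags illustrative, not the in-tree invocation):

```ts
// Hedged sketch of the harness: compile the shipped TypeScript into a
// temp dir, then dynamic-import the emitted JS so the tests exercise
// the same surface that goes into the worker bundle.
import { execSync } from "node:child_process";
import { mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const outDir = mkdtempSync(join(tmpdir(), "tokenize-test-"));
execSync(
  `npx tsc workers/src/tokenize.ts --outDir ${outDir} --target es2022 --module nodenext`,
);
const { countTokensSafe, measurePayloadShape } = await import(
  join(outDir, "tokenize.js")
);
```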

Methodology note:

- This change exists because three theoretical objections (bundle bloat,
  vodka violation, tokenizer-choice domain opinion) were falsified by a
  five-minute bench. See klappy://canon/constraints/measure-before-you-object
  and klappy://canon/observations/performed-prudence-anti-pattern (drafts
  pending merge into klappy.dev).
Mocks env.ODDKIT_TELEMETRY with a writeDataPoint capture, then exercises
recordTelemetry + measurePayloadShape with realistic JSON-RPC payloads.

Verifies end-to-end that the full PayloadShape lands in doubles 3-7,
that bytes match TextEncoder UTF-8 length, that batch JSON-RPC produces
one point per message, and that malformed input is silently dropped.

7/7 cases pass. Notable: the realistic ~8KB response measured
tokenize_ms=0.948ms — within 14% of the bench prediction (~1.1ms median
for 8KB on Node). The dream-home walkthrough was accurate; real prod
will differ but the order of magnitude is locked.

Compiles tokenize.ts + telemetry.ts via tsc into a temp dir, post-patches
the JSON import to add Node 22's required attribute syntax, then
dynamic-imports. Same code path that ships in the worker bundle.

This is the verification that wrangler dev would have done if workerd
ran in this nested sandbox (it doesn't — workerd dies after declaring
ready, likely a Linux capability issue with the container).
Two assertions that would have failed against the pre-fix code:

1. SSE response now asserts tokenize_ms=0 (was: only checked
   bytes_out/tokens_out, missed the spurious non-zero tokenize_ms that
   the original logic would record on every SSE response).

2. New test 'Bugbot invariant: tokenize_ms is 0 only when encoder did
   not actually run' explicitly covers the both-empty case (must be 0)
   and the request-only case (must be valid finite number).

Both new assertions verify Bugbot's distinction: a 0 from countTokensSafe
on empty input is a trivial short-circuit, not a real tokenization. Only
non-null results on non-empty input prove the encoder ran. The pre-fix
code conflated these and would have polluted the bench-vs-prod A/B
comparison with spurious tokenize_ms readings on SSE traffic.

Real-world tokenize_ms on the realistic 8KB integration test:
1.016ms (bench predicted 1.1ms — within 8%).

8/8 cases passing.
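
Stated as assertions (measurePayloadShape signature assumed; its import omitted):

```ts
// Illustrative restatement of the Bugbot invariant:
// tokenize_ms is 0 only when the encoder did not actually run.
import assert from "node:assert";

const bothEmpty = await measurePayloadShape("", "");
assert.equal(bothEmpty.tokenize_ms, 0); // trivial short-circuit, encoder never ran

const requestOnly = await measurePayloadShape('{"jsonrpc":"2.0","method":"x"}', "");
assert.ok(Number.isFinite(requestOnly.tokenize_ms)); // encoder ran on the request side
```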
… JSON

CRITICAL FIX. A managed-agent smoke test against the preview deployment
caught that doubles 4 (bytes_out), 6 (tokens_out), and 7 (tokenize_ms)
were all zero across every recorded data point. Six telemetry rows
queried, six rows with bytes_out=0.

Root cause: the call site in workers/src/index.ts filtered the response
clone by Content-Type, only cloning when the type included
'application/json'. MCP's Streamable HTTP transport returns
'text/event-stream' (SSE) for tool calls, not JSON. The filter was
silently dropping almost every response, leaving responseClone null and
recording zeros for the entire response side.

This was the same performed-prudence pattern the new canon docs warn
about, applied in micro: I assumed MCP responses would be JSON without
measuring what the SDK actually returns. The smoke test caught it
because canon also prescribes verification before declaring done.

Fix:

1. New helper measureResponseShape(requestText, response) in tokenize.ts.
   Clones the response, reads the body, runs measurePayloadShape. No
   Content-Type filter — read everything. SSE protocol overhead (~10
   bytes per event) is negligible against the actual payload size, and
   oddkit's responses are bounded (no long-lived streams). (Sketch below.)

2. Call site in index.ts simplified to use the helper. Drops the
   filter, drops the separate clone, drops the responseClone variable.
   Cleaner code AND correct behavior.

3. Four new unit tests for measureResponseShape:
   - measures application/json responses
   - measures text/event-stream responses (this would have caught the
     bug pre-merge)
   - leaves the original response body intact (clone correctness)
   - handles already-consumed body without throwing

12/12 unit tests pass, typecheck clean.
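
The helper's shape, sketched (simplified; PayloadShape and measurePayloadShape are the real names from tokenize.ts):

```ts
// Sketch of measureResponseShape: no Content-Type filter, so SSE and
// JSON bodies are both measured; failures degrade to null, never throw.
export async function measureResponseShape(
  requestText: string,
  response: Response,
): Promise<PayloadShape | null> {
  try {
    const responseText = await response.clone().text();
    return await measurePayloadShape(requestText, responseText);
  } catch {
    return null; // e.g. body already consumed; request path unaffected
  }
}
```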

Methodology note: this fix exists because the smoke test (live MCP
calls + telemetry_public SQL) caught what unit tests missed. The
canon-prescribed verification gate worked exactly as designed —
release-validation-gate (E0008.3) at klappy://canon/constraints/release-validation-gate
mandates independent live smoke for load-bearing surface changes
before merge. The agent dispatch is that smoke.
…Workers

Third smoke confirmed bytes_in/out and tokens_in/out now populate
correctly (357-21319 bytes_out, 142-5398 tokens_out across varied
payloads). But double7 (tokenize_ms) is still 0 across every row.

Root cause: Cloudflare Workers' performance.now() is a deterministic
timer — it does NOT advance during synchronous CPU work. The mitigation
prevents timing-side-channel attacks. The timer only ticks on I/O.

Tokenization (countTokensSafe) is pure CPU work. The encoder runs
between two reads of performance.now() with no I/O in between, so both
reads return the same value and tokenize_ms is always 0. Tests passed
in Node because Node's performance.now() is a real high-resolution
timer.

Fix: switch to Date.now(). Always advances, at 1ms resolution. The
bench-vs-prod comparison loses sub-millisecond precision (sub-ms
tokenizations round to 0) but gains a working signal for any payload
above ~5KB where bench timing exceeded 1ms.

Updated the telemetry.ts schema doc comment to document the 1ms
resolution and the Workers-specific reason.

Methodology: this is the third Cloudflare Workers gotcha caught in
prod that unit tests can't catch — Workers Runtime != Node:

  1. b94aaa6 (mine): assumed MCP responses are application/json (they're SSE)
  2. 1a555df (mine): assumed clone() inside waitUntil works (body already drained)
  3. THIS: assumed performance.now() advances in synchronous code (it doesn't)

Each was caught by the live Managed Agent smoke + telemetry_public SQL,
not by typecheck or unit tests. The release-validation-gate is the
only thing standing between this branch and a quietly broken prod
telemetry pipeline.

8 unit tests still pass. Typecheck clean.
Fourth smoke confirmed bytes_in/out and tokens_in/out work in production
(357-21319 bytes_out, 142-5398 tokens_out across varied payload sizes).
But tokenize_ms remained 0 across every row even with the Date.now()
fix from 279f761.

Root cause discovered by the agent: Cloudflare Workers freezes BOTH
performance.now() AND Date.now() during synchronous CPU work. Both
timers only advance on network I/O events as a side-channel mitigation
(documented at developers.cloudflare.com/workers/runtime-apis/web-standards/).
Tokenization is pure CPU work, so any sub-request timing of it always
reads 0 in production. This is a structural runtime constraint, not a
bug we can patch.
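
The constraint, stated as code (illustrative; in Node both deltas are non-zero):

```ts
// Inside a Workers request, with no network I/O between the reads,
// pure-CPU work leaves both clocks frozen:
const p0 = performance.now();
const d0 = Date.now();
countTokens(largeJsonPayload); // pure CPU, no I/O
console.log(performance.now() - p0); // 0 in Workers, >0 in Node
console.log(Date.now() - d0);        // 0 in Workers, >0 in Node
```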

Workarounds considered and rejected:
- Force artificial I/O between reads (KV.list, fetch) — adds real
  latency to telemetry-only paths, grotesque
- Two writeDataPoint calls with start/end timestamps — over-engineered,
  doubles write count, complicates queries
- Keep the column as always-0 — actively misleading

Decision: drop tokenize_ms entirely from PayloadShape, the doubles
array, schema doc, and tests. The bench at workers/test/tokenize.test.mjs
already characterized the cost curve (cl100k handles 50 KB in ~1.3 ms
on Node v22). Bytes_out + tokens_out are sufficient signal — a future
maintainer can predict tokenize_ms from the bench curve given the
observed payload sizes.

Schema before:
  doubles: [count, duration_ms, bytes_in, bytes_out, tokens_in,
            tokens_out, tokenize_ms]  // 7 fields

Schema after:
  doubles: [count, duration_ms, bytes_in, bytes_out, tokens_in,
            tokens_out]  // 6 fields

Companion canon update at klappy/klappy.dev coming in next commit on
that branch — drops tokenize_ms row from the doubles table and removes
the tokenize_ms mention in 'What This Enables'.

Methodology: this is the fourth Workers Runtime != Node behavioral diff
caught by live smoke on this branch. Each was unmeasurable from unit
tests because Node behaves differently:
  1. b94aaa6 (mine, broken): Content-Type filter (MCP returns SSE)
  2. 1a555df (mine, broken): clone in waitUntil (body already drained)
  3. 279f761 (mine, broken): Date.now() in Workers (frozen too)
  4. THIS: drop the unmeasurable column entirely

The release-validation-gate canon doc is the only thing that surfaced
each of these — the live preview smoke + telemetry_public SQL caught
what no test setup I could ship would have caught. The Workers-runtime
gap was real and the gate worked.

Tests:
- 7/7 unit tests pass (workers/test/tokenize.test.mjs)
- 6/6 integration tests pass (workers/test/telemetry-integration.test.mjs)
- typecheck clean
Minor bump for payload-shape telemetry (PR #134).

Bumps:
  package.json              0.23.1 -> 0.24.0
  workers/package.json      0.23.1 -> 0.24.0
  package-lock.json         0.23.0 -> 0.24.0  (root drifted one release behind)
  workers/package-lock.json 0.23.1 -> 0.24.0

CHANGELOG.md gains the [0.24.0] entry above [0.23.1] documenting:
  - Added: bytes_in/out, tokens_in/out telemetry doubles + helpers
  - Changed: drop the Content-Type filter (MCP responses are SSE)
  - Removed: tokenize_ms — Workers freezes both perf.now and Date.now
  - Fixed: root package-lock.json version drift back-fill

The four Workers Runtime != Node behavioral diffs caught by the five
Managed Agent smoke sessions on this branch are listed in the Refs
trailer for forensic record.

Tests: 7/7 unit + 6/6 integration pass on bumped state. Typecheck clean
(reports as oddkit-mcp-worker@0.24.0).

Per workflow: dedicated chore/release-x.y.z PR. Branch is off
feat/telemetry-tokenization HEAD, so it carries the feature commits +
the bump together. After merge, feat/telemetry-tokenization can be
closed (its commits are already in main via this release branch).
cloudflare-workers-and-pages bot commented Apr 23, 2026

Deploying with Cloudflare Workers

| Status | Name | Latest Commit | Updated (UTC) |
| --- | --- | --- | --- |
| ✅ Deployment successful | oddkit | d023ad6 | Apr 23 2026, 09:30 PM |

@klappy klappy closed this Apr 23, 2026
klappy (Owner, Author) commented Apr 23, 2026

Closing — bump consolidated onto #134 (commit d023ad6 is now on feat/telemetry-tokenization HEAD). One PR per feature, version bump rides along. Sorry for the duplication.

@klappy klappy deleted the chore/release-0.24.0 branch April 23, 2026 21:29
cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Bugbot Autofix is ON, but it could not run because the branch was deleted or merged before autofix could start.


Comment thread: workers/src/index.ts

```ts
const cacheTier = tracer.indexSource;
// Clone the response synchronously before returning so the body is
// still available to read inside the deferred waitUntil callback.
const responseClone = response.clone();
```

Unprotected response.clone() can break MCP responses

Medium Severity

The response.clone() call sits outside any try/catch, while the ctx.waitUntil callback's catch block (line 991–993) explicitly upholds the invariant "Telemetry must never break MCP requests." If clone() throws (e.g., the SDK returns a response with an already-disturbed or locked body), the exception prevents return response from ever executing, turning a telemetry-only code path into a user-facing 500 error. The old code had no response.clone() at all, so this is a new risk. Moving the clone inside the existing try/catch (or wrapping it in its own) would preserve the stated safety guarantee.
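
A minimal version of the suggested guard, sketched (measureAndRecord is a hypothetical stand-in for the waitUntil body):

```ts
// Illustrative fix: keep clone() inside telemetry's own guard so a
// throwing clone can never become a user-facing 500.
let responseClone: Response | null = null;
try {
  responseClone = response.clone();
} catch {
  // body already disturbed or locked: skip response-side measurement
}
ctx.waitUntil(measureAndRecord(env, requestText, responseClone));
return response;
```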


