feat: route browser telemetry directly to the VM by default#109
Merged
Conversation
|
Created a monitoring plan for this PR. What this PR does: Fixes the browser telemetry stream so it calls the correct API path ( Intended effect:
Risks:
Status updates will be posted automatically on this PR as monitoring progresses. |
Add "telemetry" to the default KERNEL_BROWSER_ROUTING_SUBRESOURCES
allowlist so the telemetry SSE stream is routed directly to the browser
VM out of the box, and change the telemetry stream method path from
/browsers/{id}/telemetry to /browsers/{id}/telemetry/stream so that the
direct-VM path rewrite yields {base_url}/telemetry/stream (the VM's SSE
endpoint; {base_url}/telemetry is a different, non-stream endpoint).
This DEPENDS ON the control-plane PR renaming the public endpoint
/browsers/{id}/telemetry -> /browsers/{id}/telemetry/stream. Until that
deploys, the non-routed path 404s in prod, so telemetry.stream works only
via direct routing today (which also independently proves routing works).
Verified with a live smoke test against prod: created a telemetry-enabled
headless browser, opened the telemetry stream, generated VM API activity
via browsers.curl, observed telemetry events, and confirmed the stream
request was rewritten to the VM (host proxy.*:8443, path
/telemetry/stream, Authorization stripped, jwt query param appended) and
never hit api.onkernel.com.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
eventStreamDecoder.Next dispatched an event on every blank line, including after a comment-only block such as the server's ":\n\n" keepalive (sent every ~15s on an idle stream, and on the initial idle window). That produced an event with an empty Data buffer, which Stream[T].Next then fed to json.Unmarshal -> "unexpected end of JSON input", ending the stream. This made idle browser telemetry streams die. Skip blocks that have no event type and no data so keepalive comments are ignored. Adds regression tests for a keepalive following an event and a standalone keepalive. Also simplifies examples/browser-telemetry: with the decoder fixed it no longer needs a background goroutine generating continuous activity to dodge the keepalive, so it now mirrors the node/python telemetry examples. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ac1265d to
59bd952
Compare
archandatta
approved these changes
Jun 3, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
/browsers/{id}/telemetryto/browsers/{id}/telemetry/stream(browsertelemetry.go). The direct-VM path rewrite is a pure passthrough, so this now yields{base_url}/telemetry/stream— the VM's SSE stream endpoint. ({base_url}/telemetryon the VM is a different, non-stream endpoint that returns JSON config.)telemetryto the defaultKERNEL_BROWSER_ROUTING_SUBRESOURCESallowlist (browser_routing.go), so the telemetry SSE stream is routed directly to the browser VM with no env var configuration.[curl, telemetry], and addedTestBrowserRoutingRewritesTelemetryStreamToVMproving a telemetry call is rewritten to{base_url}/telemetry/streamwith theAuthorizationheader stripped and ajwtquery param appended. Updatedapi.mdtitle toget /browsers/{id}/telemetry/stream.examples/browser-telemetry-routing-smoke— a live smoke program.Dependency
DEPENDS ON the control-plane PR renaming the public endpoint
/browsers/{id}/telemetry->/browsers/{id}/telemetry/stream. That rename is not yet deployed to prod, so a non-routed call to the new SDK path 404s today; the direct-routed call (rewritten on the VM) already works. This is exactly why the live smoke exercises the direct-routed path — it also independently proves routing is doing the work.Live smoke evidence (prod)
Created a telemetry-enabled headless browser (
timeout_seconds=120,telemetry.enabled=true), openedclient.Browsers.Telemetry.StreamStreaming, generated activity viabrowsers.curl, and asserted both event delivery and routing. Ran 3/3 green. Example:Asserted: the recorded stream URL host matches the session
base_urlhost (proxy.*:8443), path ends with/telemetry/stream, it is NOTapi.onkernel.com,Authorizationis stripped, and ajwtquery param is present.Note observed during smoke
The generated
ssestream.Stream[T]wrapper surfaces SSE keepalive comment frames (: ...) as an "unexpected end of JSON input" error and terminates (it tries tojson.Unmarshalan empty data buffer). The smoke example reopens the stream on that error and keeps reading until a data frame arrives, so it is reliable. This is a pre-existing, codegen-owned robustness issue independent of routing and is out of scope for this PR.🤖 Generated with Claude Code
Note
Medium Risk
Changes default request routing and JWT/header handling for telemetry streams; mis-routing could break streaming until the control-plane path rename is deployed everywhere.
Overview
Browser telemetry is now routed directly to the VM by default, alongside
curl, via theKERNEL_BROWSER_ROUTING_SUBRESOURCESdefault allowlist—no extra env configuration. Integration coverage confirms telemetry SSE calls are rewritten to{base_url}/telemetry/streamwith JWT in the query and Authorization stripped.The SSE client (
packages/ssestream) ignores empty comment-only blocks (:\n\nkeepalives) so idle telemetry streams no longer die with JSON decode errors; regression tests and aexamples/browser-telemetrysample were added.Note: Depends on the control-plane rename to
/browsers/{id}/telemetry/streamfor non-routed API calls; direct VM routing already targets the stream endpoint.Reviewed by Cursor Bugbot for commit 59bd952. Bugbot is set up for automated code reviews on this repo. Configure here.
Update — idle-stream SSE keepalive fix now included in this PR
The earlier note that the SSE keepalive issue was out of scope is superseded: this PR now fixes it.
packages/ssestream/ssestream.gono longer dispatches empty comment-only blocks (the server's:\n\nkeepalive, sent every ~15s on an idle stream and on the initial idle window), which previously surfaced asunexpected end of JSON inputand killed idle telemetry streams. Added regression tests (packages/ssestream/ssestream_test.go) and simplifiedexamples/browser-telemetryto drop the background-activity workaround so it mirrors the node/python examples.Note:
packages/ssestream/ssestream.gois Stainless codegen-owned; this is a local patch pending an upstream fix in the generator config.