feat: route browser telemetry directly to the VM by default#108
Merged
Conversation
|
Created a monitoring plan for this PR. What this PR does: Fixes the browser telemetry stream so it actually works — the SDK was calling the wrong API path, and now also routes telemetry requests directly to the browser VM (bypassing the central API) for lower latency. Intended effect:
Risks:
Status updates will be posted automatically on this PR as monitoring progresses. |
Telemetry is now a default direct-to-VM routing subresource. The
telemetry stream method path is changed from /browsers/{id}/telemetry to
/browsers/{id}/telemetry/stream so it mirrors the browser VM endpoint:
when the request is rewritten for direct routing it yields
{base_url}/telemetry/stream, which is the SSE stream on the VM (the VM's
/telemetry is a different, non-stream JSON endpoint).
"telemetry" is added to the default KERNEL_BROWSER_ROUTING_SUBRESOURCES
allowlist alongside "curl".
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The SSE decoder's empty-block guard included last_event_id, which is sticky across events per the SSE spec. Once any event carried an id, every subsequent comment-only block (e.g. the server's ":\n\n" keepalive, sent every 15s on an idle stream) fell through the guard and dispatched an empty-data event. The typed Stream wrapper then calls .json() on it unconditionally, raising JSONDecodeError and killing the stream. This made idle browser telemetry streams crash after ~15s. Drop last_event_id from the guard so dispatch depends only on the current block's event/data/retry fields. Event-typed empty-data frames still dispatch (unchanged). Adds a regression test for a keepalive comment following an id-bearing event. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ad8d267 to
a0ee2b2
Compare
archandatta
approved these changes
Jun 3, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
GET /browsers/{id}/telemetrytoGET /browsers/{id}/telemetry/stream(both sync and async) so it mirrors the browser VM endpoint. When the request is rewritten for direct-to-VM routing, the pure path passthrough yields{base_url}/telemetry/stream— the SSE stream on the VM. (The VM's{base_url}/telemetryis a different, non-stream JSON config endpoint, so the/streamsuffix is required for direct routing to hit the stream.)telemetryadded to the defaultKERNEL_BROWSER_ROUTING_SUBRESOURCESallowlist alongsidecurl, so telemetry streams route directly to the VM with no env var configuration.api.mdand the routing unit tests; added a unit test asserting atelemetry.stream(...)call rewrites to{base_url}/telemetry/streamwith the jwt query param appended and theAuthorizationheader stripped.Dependency / rollout note
This DEPENDS ON the control-plane PR renaming
/browsers/{id}/telemetry->/browsers/{id}/telemetry/stream. That rename is not yet deployed to prod. Until it deploys, a non-routed call to the new SDK path 404s in prod, while a direct-routed call (rewritten to{base_url}/telemetry/streamon the VM) works today. Direct routing is the default for telemetry, so the common path works now; this also independently proves the rewrite is doing the work.Live smoke evidence (prod)
Created a headless browser with telemetry enabled, opened
browsers.telemetry.stream(...), generated activity viabrowsers.curl(...), and instrumented the httpx layer to record the final outbound request:category="api",type="api_call",operation_id="BrowserCurl")proxy.jfk-focused-mirzakhani.onkernel.com:8443(the session base_url host) — NOTapi.onkernel.com/telemetry/streamAuthorizationheader stripped;jwtquery param presentfinallyTests
uv run pytest tests/test_browser_routing.py— 21 passeduv run ruff checkon changed paths — cleanuv run python -c "import kernel"— ok🤖 Generated with Claude Code
Note
Medium Risk
Changes default request routing and auth stripping for telemetry SSE, with a control-plane path dependency noted in the PR; the streaming decoder fix is localized but affects all SSE consumers.
Overview
Browser telemetry is now included in the default direct-to-VM routing allowlist (
curlandtelemetry), sobrowsers.telemetry.streamrewrites to the session VM with JWT query auth and noAuthorizationheader unless routing is disabled via env. A new example and routing test cover streaming telemetry from a session.The SSE decoder no longer treats sticky
last_event_idas a reason to emit an event on comment-only keepalive blocks (:\n\n). That fixes idle telemetry streams dying withJSONDecodeErroron empty frames after the first event with anid. A sync/async regression test was added.Reviewed by Cursor Bugbot for commit a0ee2b2. Bugbot is set up for automated code reviews on this repo. Configure here.
Update — idle-stream SSE keepalive fix now included in this PR
This PR also fixes a crash on idle SSE streams. The SSE decoder's empty-block guard (
src/kernel/_streaming.py) includedlast_event_id, which is sticky across events per the SSE spec — so once any event carried an id, every subsequent comment-only block (the server's:\n\nkeepalive, sent every ~15s) dispatched an empty-data event that the typedStreamwrapper then.json()-decoded, raisingJSONDecodeErrorand ending the stream. Droppedlast_event_idfrom the guard; added a regression test (tests/test_streaming.py).Note:
src/kernel/_streaming.pyis Stainless codegen-owned; this is a local patch pending an upstream fix in the generator config.