[kernel-1116] Add CDP Monitor#213
[kernel-1116] Add CDP Monitor#213archandatta wants to merge 6 commits intoarchand/kernel-1116/cdp-pipelinefrom
Conversation
|
Firetiger deploy monitoring skipped This PR didn't match the auto-monitor filter configured on your GitHub connection:
Reason: PR adds new CDP monitor functionality but does not modify API endpoints (packages/api/cmd/api/) or Temporal workflows (packages/api/lib/temporal) as specified in the filter. To monitor this PR anyway, reply with |
| return true | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Suffix-based MIME check inconsistent between capture functions
Medium Severity
isCapturedMIME restricts the +json/+xml/+csv suffix check to only application/vnd.* types, while bodyCapFor applies the same suffix check to all MIME types. This means common structured content types like application/ld+json, application/hal+json, and application/problem+json will be rejected by the isCapturedMIME gatekeeper and never have their bodies captured — even though bodyCapFor was clearly designed to give them the full 8 KB cap. The bodyCapFor comment also incorrectly claims it matches isCapturedMIME's allow-list.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit d73793c. Configure here.
hiroTamada
left a comment
There was a problem hiding this comment.
solid foundation overall — the computed event state machine and timer invalidation logic are well designed. one blocking concern around sensitive data capture in interaction.js.
| })); | ||
| }, true); | ||
|
|
||
| document.addEventListener('keydown', function(e) { |
There was a problem hiding this comment.
the keydown listener captures every keystroke with no filtering — this is effectively a keylogger. since this gets injected into all browser VMs (including sessions where users type passwords via managed auth), sensitive input like passwords, credit card numbers, and SSNs will flow through the event pipeline and get persisted. at minimum, keystrokes in <input type="password"> fields should be redacted or skipped. worth also considering fields with autocomplete="cc-number", name="ssn", etc. the click listener's textContent capture has a similar (lesser) exposure. this has PCI/SOC2/GDPR implications if the data hits storage.
rgarcia
left a comment
There was a problem hiding this comment.
can you combine all the prs into one? I'm finding it hard to review this since most of the code is unused
server/lib/cdpmonitor/types.go
Outdated
| // on the root session (sessionID="") before navigation has been recorded. | ||
| const mainSessionUnset = "\x00unset" | ||
|
|
||
| // Event type constants for all events published by the cdpmonitor. |
There was a problem hiding this comment.
i think this constant block would be a lot easier to reason about with some cosmetic structure: group the names into direct-ish CDP-derived monitor events, computed/synthetic events from computed.go, injected page-side interaction events, and monitor lifecycle/internal events, and add short godoc comments for each group (or for the less obvious constants). right now they're all in one flat list, which makes it hard to tell which names are grounded in protocol counterparts vs introduced locally by the monitor.
server/lib/cdpmonitor/types.go
Outdated
| addedAt time.Time // for TTL eviction | ||
| } | ||
|
|
||
| // cdpConsoleArg is a single Runtime.consoleAPICalled argument. |
There was a problem hiding this comment.
i'd like for this first decode layer to be much closer to the CDP protocol / PDL types, and then have a second translation step into monitor-specific event shapes. right now things like cdpConsoleArg / cdpConsoleParams are custom projections of Runtime.RemoteObject / Runtime.consoleAPICalled, which makes it harder to audit against the source of truth and easier to accidentally drop fields we may want later. i think mirroring the protocol more directly at the boundary would make this foundation easier to reason about, with the domain-specific simplification happening after decode. the upstream source of truth here is the CDP spec / PDL, e.g. js_protocol.pdl.
There was a problem hiding this comment.
Submitted a plan for the changes -> #216
Second layer of the CDP monitor split. Adds the full Monitor struct, all lifecycle machinery, and the test infrastructure. dispatchEvent is a no-op stub - event handlers land in #213 <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Medium Risk** > Adds substantial new WebSocket lifecycle/reconnect and command routing logic with multiple goroutines/locks; regressions could impact capture stability and leak resources. Tests mitigate but concurrency and reconnect edge cases remain moderately risky. > > **Overview** > Implements the core `cdpmonitor.Monitor` lifecycle: establishes a CDP WebSocket connection, routes command responses via `send()`, starts the init sequence (`Target.setAutoAttach` + attach to existing targets), and adds a `Health()` snapshot for operational visibility. > > Adds automatic reconnect on upstream URL changes with backoff, state reset/unblocking of in-flight commands, and lifecycle events (`monitor_disconnected`, `monitor_reconnected`, `monitor_reconnect_failed`, `monitor_init_failed`). > > Introduces screenshot capture/publishing with rate limiting and optional ffmpeg downscaling, plus a comprehensive test harness (fake WS server/upstream, event collector) and new unit tests covering lifecycle, reconnect, screenshot behavior, and pending-command unblocking. Domain enablement/script injection is factored into `domains.go`, and `dispatchEvent` is stubbed pending the follow-up handlers PR. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit bfb6cd7. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->
| m.computed.onLoadingFinished() // keep netPending consistent | ||
| } | ||
| } | ||
| m.pendReqMu.Unlock() |
There was a problem hiding this comment.
Sweep calls onLoadingFinished while holding pendReqMu
Medium Severity
sweepPendingRequests calls m.computed.onLoadingFinished() inside a loop while holding pendReqMu. Each call acquires computed.mu, decrements netPending, and may restart the network idle timer. When multiple entries expire in one sweep, netPending can be decremented below the actual count (since the sweep-evicted requests were never tracked via onRequest), and the timer is needlessly restarted on each iteration. The onLoadingFinished calls here assume a 1:1 pairing with prior onRequest calls that may never have occurred for stale entries.
Reviewed by Cursor Bugbot for commit 348243a. Configure here.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 4 total unresolved issues (including 2 from previous reviews).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 0605227. Configure here.
| "stack_trace": p.StackTrace, | ||
| }) | ||
| m.publishEvent(EventConsoleLog, events.CategoryConsole, events.Source{Kind: events.KindCDP}, "Runtime.consoleAPICalled", data, sessionID) | ||
| } |
There was a problem hiding this comment.
Console handler ignores event type, mislabels errors
Medium Severity
handleConsole unconditionally publishes every Runtime.consoleAPICalled event as EventConsoleLog, regardless of p.Type. The constant comments explicitly state EventConsoleError is for "Runtime.consoleAPICalled (type=error)", but no code path maps console.error() calls to that event type. Downstream consumers filtering on event type console_error will miss all JS console.error() output. Additionally, the comment on EventConsoleLog incorrectly references "Runtime.exceptionThrown" which is actually handled by handleExceptionThrown and published as EventConsoleError.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 0605227. Configure here.
| PendingRequests: pendReqs, | ||
| Sessions: sessions, | ||
| } | ||
| } |
There was a problem hiding this comment.
Exported Health and MonitorHealth are unused
Low Severity
Health() and MonitorHealth are exported but have no callers anywhere in the codebase outside their own definitions. This adds unused public API surface to the package.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 0605227. Configure here.


Introduces the foundational layer of the CDP monitor as a standalone reviewablechunk. No Monitor struct wiring, just the primitives that everything else builds on.
types.go: CDP wire format (cdpMessage), all event type constants, internal state structs (networkReqState, targetInfo, CDP param shapes).
util.go: Console arg extraction, MIME allow-list (isCapturedMIME), resource type filter (isTextualResource), per-MIME body size caps (bodyCapFor), UTF-8-safe body truncation (truncateBody).
computed.go: State machine for the three derived events: network_idle (500ms debounce after all requests finish), layout_settled (1s after page_load with no layout shifts), navigation_settled (fires once all three flags converge). Timer invalidation via navSeq prevents stale AfterFunc callbacks from publishing for a previous navigation.
domains.go: isPageLikeTarget predicate (pages and iframes get Page.* / PerformanceTimeline.*; workers don't), bindingName constant, interaction.js embed.
interaction.js: Injected script tracking clicks, keydowns, and scroll-settled events via the __kernelEvent CDP binding.
Note
High Risk
Adds a new always-on CDP WebSocket monitor that captures network/console/page events (including response bodies) and injects page JS, which is privacy- and stability-sensitive and could affect production telemetry and resource usage. Also introduces reconnect and timer-based state machines, increasing concurrency/edge-case risk.
Overview
Introduces a new
cdpmonitorimplementation that connects to Chrome DevTools over WebSocket, auto-attaches to targets, enables relevant domains, and dispatches CDP notifications into the existing event pipeline (console logs/errors, network request/response + optional truncated bodies, page lifecycle/navigation, target lifecycle, and layout shifts).Adds synthetic page health signals (
network_idle,layout_settled,navigation_settled) driven by debounce timers and navigation resets, plus injected JS (interaction.js) to emit click/key/scroll events with per-session rate limiting and suppression of sensitive keystrokes.Implements operational behaviors: restart/reconnect handling on upstream URL changes (with backoff and lifecycle events), periodic eviction of stuck pending network requests, and a rate-limited screenshot publisher backed by
ffmpeg(with downscaling). The API service is updated to construct the monitor with asloglogger, and the PR adds extensive fixture-based and integration-style tests for protocol round-tripping, handlers, computed state, lifecycle, reconnect, and screenshots.Reviewed by Cursor Bugbot for commit 0605227. Bugbot is set up for automated code reviews on this repo. Configure here.