Skip to content

[kernel-1116] Add CDP Monitor#213

Open
archandatta wants to merge 6 commits intoarchand/kernel-1116/cdp-pipelinefrom
archand/kernel-1116/cdp-foundation
Open

[kernel-1116] Add CDP Monitor#213
archandatta wants to merge 6 commits intoarchand/kernel-1116/cdp-pipelinefrom
archand/kernel-1116/cdp-foundation

Conversation

@archandatta
Copy link
Copy Markdown
Contributor

@archandatta archandatta commented Apr 13, 2026

Introduces the foundational layer of the CDP monitor as a standalone reviewablechunk. No Monitor struct wiring, just the primitives that everything else builds on.

  • types.go: CDP wire format (cdpMessage), all event type constants, internal state structs (networkReqState, targetInfo, CDP param shapes).

  • util.go: Console arg extraction, MIME allow-list (isCapturedMIME), resource type filter (isTextualResource), per-MIME body size caps (bodyCapFor), UTF-8-safe body truncation (truncateBody).

  • computed.go: State machine for the three derived events: network_idle (500ms debounce after all requests finish), layout_settled (1s after page_load with no layout shifts), navigation_settled (fires once all three flags converge). Timer invalidation via navSeq prevents stale AfterFunc callbacks from publishing for a previous navigation.

  • domains.go: isPageLikeTarget predicate (pages and iframes get Page.* / PerformanceTimeline.*; workers don't), bindingName constant, interaction.js embed.

  • interaction.js: Injected script tracking clicks, keydowns, and scroll-settled events via the __kernelEvent CDP binding.


Note

High Risk
Adds a new always-on CDP WebSocket monitor that captures network/console/page events (including response bodies) and injects page JS, which is privacy- and stability-sensitive and could affect production telemetry and resource usage. Also introduces reconnect and timer-based state machines, increasing concurrency/edge-case risk.

Overview
Introduces a new cdpmonitor implementation that connects to Chrome DevTools over WebSocket, auto-attaches to targets, enables relevant domains, and dispatches CDP notifications into the existing event pipeline (console logs/errors, network request/response + optional truncated bodies, page lifecycle/navigation, target lifecycle, and layout shifts).

Adds synthetic page health signals (network_idle, layout_settled, navigation_settled) driven by debounce timers and navigation resets, plus injected JS (interaction.js) to emit click/key/scroll events with per-session rate limiting and suppression of sensitive keystrokes.

Implements operational behaviors: restart/reconnect handling on upstream URL changes (with backoff and lifecycle events), periodic eviction of stuck pending network requests, and a rate-limited screenshot publisher backed by ffmpeg (with downscaling). The API service is updated to construct the monitor with a slog logger, and the PR adds extensive fixture-based and integration-style tests for protocol round-tripping, handlers, computed state, lifecycle, reconnect, and screenshots.

Reviewed by Cursor Bugbot for commit 0605227. Bugbot is set up for automated code reviews on this repo. Configure here.

@archandatta archandatta marked this pull request as ready for review April 13, 2026 14:56
@firetiger-agent
Copy link
Copy Markdown

Firetiger deploy monitoring skipped

This PR didn't match the auto-monitor filter configured on your GitHub connection:

Any PR that changes the kernel API. Monitor changes to API endpoints (packages/api/cmd/api/) and Temporal workflows (packages/api/lib/temporal) in the kernel repo

Reason: PR adds new CDP monitor functionality but does not modify API endpoints (packages/api/cmd/api/) or Temporal workflows (packages/api/lib/temporal) as specified in the filter.

To monitor this PR anyway, reply with @firetiger monitor this.

return true
}
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suffix-based MIME check inconsistent between capture functions

Medium Severity

isCapturedMIME restricts the +json/+xml/+csv suffix check to only application/vnd.* types, while bodyCapFor applies the same suffix check to all MIME types. This means common structured content types like application/ld+json, application/hal+json, and application/problem+json will be rejected by the isCapturedMIME gatekeeper and never have their bodies captured — even though bodyCapFor was clearly designed to give them the full 8 KB cap. The bodyCapFor comment also incorrectly claims it matches isCapturedMIME's allow-list.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit d73793c. Configure here.

Copy link
Copy Markdown
Contributor

@hiroTamada hiroTamada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

solid foundation overall — the computed event state machine and timer invalidation logic are well designed. one blocking concern around sensitive data capture in interaction.js.

}));
}, true);

document.addEventListener('keydown', function(e) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the keydown listener captures every keystroke with no filtering — this is effectively a keylogger. since this gets injected into all browser VMs (including sessions where users type passwords via managed auth), sensitive input like passwords, credit card numbers, and SSNs will flow through the event pipeline and get persisted. at minimum, keystrokes in <input type="password"> fields should be redacted or skipped. worth also considering fields with autocomplete="cc-number", name="ssn", etc. the click listener's textContent capture has a similar (lesser) exposure. this has PCI/SOC2/GDPR implications if the data hits storage.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed -> a62c403

Copy link
Copy Markdown
Contributor

@rgarcia rgarcia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you combine all the prs into one? I'm finding it hard to review this since most of the code is unused

// on the root session (sessionID="") before navigation has been recorded.
const mainSessionUnset = "\x00unset"

// Event type constants for all events published by the cdpmonitor.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this constant block would be a lot easier to reason about with some cosmetic structure: group the names into direct-ish CDP-derived monitor events, computed/synthetic events from computed.go, injected page-side interaction events, and monitor lifecycle/internal events, and add short godoc comments for each group (or for the less obvious constants). right now they're all in one flat list, which makes it hard to tell which names are grounded in protocol counterparts vs introduced locally by the monitor.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addedAt time.Time // for TTL eviction
}

// cdpConsoleArg is a single Runtime.consoleAPICalled argument.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd like for this first decode layer to be much closer to the CDP protocol / PDL types, and then have a second translation step into monitor-specific event shapes. right now things like cdpConsoleArg / cdpConsoleParams are custom projections of Runtime.RemoteObject / Runtime.consoleAPICalled, which makes it harder to audit against the source of truth and easier to accidentally drop fields we may want later. i think mirroring the protocol more directly at the boundary would make this foundation easier to reason about, with the domain-specific simplification happening after decode. the upstream source of truth here is the CDP spec / PDL, e.g. js_protocol.pdl.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Submitted a plan for the changes -> #216

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed -> 0605227

Second layer of the CDP monitor split. Adds the full Monitor struct, all
lifecycle machinery, and the test infrastructure. dispatchEvent is a
no-op stub - event handlers land in #213

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Adds substantial new WebSocket lifecycle/reconnect and command routing
logic with multiple goroutines/locks; regressions could impact capture
stability and leak resources. Tests mitigate but concurrency and
reconnect edge cases remain moderately risky.
> 
> **Overview**
> Implements the core `cdpmonitor.Monitor` lifecycle: establishes a CDP
WebSocket connection, routes command responses via `send()`, starts the
init sequence (`Target.setAutoAttach` + attach to existing targets), and
adds a `Health()` snapshot for operational visibility.
> 
> Adds automatic reconnect on upstream URL changes with backoff, state
reset/unblocking of in-flight commands, and lifecycle events
(`monitor_disconnected`, `monitor_reconnected`,
`monitor_reconnect_failed`, `monitor_init_failed`).
> 
> Introduces screenshot capture/publishing with rate limiting and
optional ffmpeg downscaling, plus a comprehensive test harness (fake WS
server/upstream, event collector) and new unit tests covering lifecycle,
reconnect, screenshot behavior, and pending-command unblocking. Domain
enablement/script injection is factored into `domains.go`, and
`dispatchEvent` is stubbed pending the follow-up handlers PR.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
bfb6cd7. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
m.computed.onLoadingFinished() // keep netPending consistent
}
}
m.pendReqMu.Unlock()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweep calls onLoadingFinished while holding pendReqMu

Medium Severity

sweepPendingRequests calls m.computed.onLoadingFinished() inside a loop while holding pendReqMu. Each call acquires computed.mu, decrements netPending, and may restart the network idle timer. When multiple entries expire in one sweep, netPending can be decremented below the actual count (since the sweep-evicted requests were never tracked via onRequest), and the timer is needlessly restarted on each iteration. The onLoadingFinished calls here assume a 1:1 pairing with prior onRequest calls that may never have occurred for stale entries.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 348243a. Configure here.

@archandatta archandatta changed the title [kernel-1116] Setup CDP Monitor Foundation [kernel-1116] Add CDP Monitor Apr 13, 2026
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 4 total unresolved issues (including 2 from previous reviews).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 0605227. Configure here.

"stack_trace": p.StackTrace,
})
m.publishEvent(EventConsoleLog, events.CategoryConsole, events.Source{Kind: events.KindCDP}, "Runtime.consoleAPICalled", data, sessionID)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Console handler ignores event type, mislabels errors

Medium Severity

handleConsole unconditionally publishes every Runtime.consoleAPICalled event as EventConsoleLog, regardless of p.Type. The constant comments explicitly state EventConsoleError is for "Runtime.consoleAPICalled (type=error)", but no code path maps console.error() calls to that event type. Downstream consumers filtering on event type console_error will miss all JS console.error() output. Additionally, the comment on EventConsoleLog incorrectly references "Runtime.exceptionThrown" which is actually handled by handleExceptionThrown and published as EventConsoleError.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 0605227. Configure here.

PendingRequests: pendReqs,
Sessions: sessions,
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exported Health and MonitorHealth are unused

Low Severity

Health() and MonitorHealth are exported but have no callers anywhere in the codebase outside their own definitions. This adds unused public API surface to the package.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 0605227. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants