[kernel-1116] browser logging: Event Schema & Pipeline#184

Merged
archandatta merged 22 commits into main from
archand/kernel-1116/browser-logging
Apr 2, 2026

Conversation

@archandatta

@archandatta archandatta commented Mar 23, 2026

Note

Medium Risk
Introduces new concurrent event ingestion/storage primitives (ring buffer fan-out, file appends, truncation) where correctness depends on sequencing and thread-safety. Risk is moderate due to new concurrency and I/O behavior, though scope is isolated and covered by extensive tests.

Overview
Adds a new server/lib/events logging pipeline with a portable Event/Envelope JSON schema (including source metadata, detail levels, and 1MB record-size enforcement via truncation).

Introduces CaptureSession as the single write path that assigns monotonically increasing sequence numbers, defaults timestamps/detail level, writes durable per-category JSONL logs via FileWriter, and publishes to an in-memory RingBuffer for non-blocking fan-out to multiple readers with explicit drop notifications on overflow.

Includes a comprehensive new test suite covering serialization, ring buffer overflow/resume semantics, concurrent readers/writers, file routing/lazy open, and capture-session sequencing/truncation behavior.

Written by Cursor Bugbot for commit 5d958df. This will update automatically on new commits.

@archandatta archandatta marked this pull request as ready for review March 24, 2026 12:50
@archandatta archandatta force-pushed the archand/kernel-1116/browser-logging branch from 09ed5ed to 29f2bbf on March 24, 2026 13:21

@rgarcia rgarcia left a comment


nice direction on the pipeline/schema. one thing i'd consider before cementing BrowserEvent v1: if this is eventually going to drive both capture controls (for example: "turn on cdp console only" or "turn on network request/response capture at headers-only detail") and subscriptions, it might be worth making those selector dimensions first-class in the envelope instead of encoding too much into a single type string.

concretely, i think i'd lean toward:

  • keeping the primary event identity semantic, e.g. console.log, network.request, input.click
  • adding explicit provenance fields like source_kind (cdp, kernel_api, extension, local_process) plus source_name / source_event (for example Runtime.consoleAPICalled)
  • adding an explicit detail_level (minimal, default, verbose, raw)
  • possibly making category first-class too instead of deriving it from the type prefix

i probably would not use raw Runtime.* / Network.* as the primary type, since that makes future non-cdp producers feel awkward/second-class. i think the semantic-type + provenance split ages better if we later want to emit events from things like:

  • third-party extensions running in the browser and talking to localhost
  • vm-local helper processes/programs running alongside the browser
  • server/api-driven tool actions like screenshot/input/recording events

that shape also gives the system a much more natural control surface for both capture config and subscriptions, since selectors can operate directly on stable fields like category, topic, source_kind, and detail_level instead of needing to parse overloaded event names.


@Sayan- Sayan- left a comment


focused review on the pipeline since raf had some feedback on the event schema!

@archandatta archandatta force-pushed the archand/kernel-1116/browser-logging branch from 29f2bbf to 997edb4 on March 27, 2026 11:38
@archandatta archandatta force-pushed the archand/kernel-1116/browser-logging branch from b9a88df to 1644fe7 on March 27, 2026 13:48
@archandatta archandatta requested review from Sayan- and rgarcia March 30, 2026 19:35

@Sayan- Sayan- left a comment


changes look great!

r.nextSeq = oldest
r.rb.mu.RUnlock()
data := json.RawMessage(fmt.Sprintf(`{"dropped":%d}`, dropped))
return BrowserEvent{Type: "events.dropped", Category: CategorySystem, Source: SourceKernelAPI, Data: data}, nil


On final review I'm wondering if the drop sentinel should be a separate type from BrowserEvent. A drop notification is stream metadata, not browser content.

A deeper dive a la opus:

The current sentinel has fabricated/zero fields (Seq: 0, Ts: 0, Category: "system", Source: "kernel_api") that aren't actually observed from the browser. Since there are no consumers yet, now's the cheapest time to split this:

type ReadResult struct {
    Event   *BrowserEvent // nil when Dropped > 0
    Dropped uint64        // count of events lost at this point
}

This makes the contract compile-time safe: consumers can't accidentally serialize a drop as a real event, because Event is nil. The transport layer (e.g. a future WebSocket handler) can decide how to represent drops on the wire independently.

mu sync.Mutex
ring *RingBuffer
files *FileWriter
seq atomic.Uint64


might be simpler to just make this a uint64 since we only mutate it while holding the lock. nbd

"time"
)

// Pipeline glues a RingBuffer and a FileWriter into a single write path


may be helpful to capture the intended use of the pipeline wrt lifecycle (e.g. can you stop and restart the same pipeline, do you need to send a terminal event before closing, etc)


// Start sets the capture session ID that will be stamped on every subsequent
// published event
func (p *Pipeline) Start(captureSessionID string) {


i think we're still missing a separation between subscription selectors and native producer config. some producers will be native to the image server and controllable by us (like cdp), while others may be external and not controllable at all. because of that, i'm not sure subscriptions should implicitly mean "turn this producer on/reconfigure it". i think the api may want subscriptions to stay a pure consumer-side filter, with native producer enablement/config modeled explicitly and separately.
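A minimal sketch of what "subscriptions as a pure consumer-side filter" could look like (illustrative only; the Subscription type and its fields are assumptions, not code from this PR):

```go
package main

import "fmt"

// Event carries the stable fields a selector would filter on.
type Event struct {
	Category    string
	SourceKind  string
	DetailLevel string
}

// Subscription is a pure consumer-side filter: it whitelists values on
// stable fields and never starts, stops, or reconfigures a producer.
// An empty slice means "match any value for this dimension".
type Subscription struct {
	Categories  []string
	SourceKinds []string
}

func contains(xs []string, v string) bool {
	for _, x := range xs {
		if x == v {
			return true
		}
	}
	return false
}

// Matches reports whether an event passes every non-empty dimension.
func (s Subscription) Matches(e Event) bool {
	if len(s.Categories) > 0 && !contains(s.Categories, e.Category) {
		return false
	}
	if len(s.SourceKinds) > 0 && !contains(s.SourceKinds, e.SourceKind) {
		return false
	}
	return true
}

func main() {
	sub := Subscription{Categories: []string{"console"}}
	fmt.Println(sub.Matches(Event{Category: "console", SourceKind: "cdp"}))
	fmt.Println(sub.Matches(Event{Category: "network", SourceKind: "cdp"}))
}
```

Native producer enablement (e.g. which CDP domains are on) would live in a separate config type that only applies to producers the server controls.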

Ts int64 `json:"ts"`
Type string `json:"type"`
Category EventCategory `json:"category"`
Source Source `json:"source"`


this is starting to feel like provenance wants to be an object rather than multiple top-level fields. i'd consider making source structured and moving cdp-specific context under source.metadata, e.g.

{
  "type": "console.log",
  "category": "console",
  "detail_level": "standard",
  "source": {
    "kind": "cdp",
    "event": "Runtime.consoleAPICalled",
    "metadata": {
      "target_id": "...",
      "cdp_session_id": "...",
      "frame_id": "...",
      "parent_frame_id": "..."
    }
  },
  "data": { ... }
}

that keeps the top level focused on stable cross-producer fields and makes non-cdp producers feel first-class too.


// BrowserEvent is the canonical event structure for the browser capture pipeline.
type BrowserEvent struct {
CaptureSessionID string `json:"capture_session_id"`


what is capture_session_id intended to identify? this reads more like pipeline/native-producer lifecycle metadata than part of the portable event schema. if resume/subscription are meant to work across heterogeneous producers, i'd be careful about baking a producer-owned session id into every event.

// If the ring has already published events, the reader will receive an
// events_dropped BrowserEvent on the first Read call if it has fallen behind
// the oldest retained event
func (rb *RingBuffer) NewReader() *Reader {


i think the resumable story wants to show up in the api here eventually. right now NewReader() always starts from 0, so there isn't yet an explicit way to resume from a previously persisted cursor/offset after reconnect.
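One possible shape for that resume API (a sketch under the assumption that readers track a next-sequence cursor; none of this is the PR's actual code):

```go
package main

import "fmt"

// RingBuffer retains a window of the logical stream. Only the cursor
// bookkeeping relevant to resume is sketched here.
type RingBuffer struct {
	oldest uint64 // seq of the oldest event still retained
	next   uint64 // seq the next publish will receive
}

// Reader tracks an independent position; dropped counts events that were
// evicted before this reader could observe them.
type Reader struct {
	nextSeq uint64
	dropped uint64
}

// NewReader resumes after an explicit seq (e.g. a persisted cursor from a
// reconnecting client), clamping to what the ring still retains and
// recording how many events were lost in the gap.
func (rb *RingBuffer) NewReader(afterSeq uint64) *Reader {
	r := &Reader{nextSeq: afterSeq + 1}
	if r.nextSeq < rb.oldest {
		r.dropped = rb.oldest - r.nextSeq
		r.nextSeq = rb.oldest
	}
	return r
}

func main() {
	rb := &RingBuffer{oldest: 100, next: 250}
	r := rb.NewReader(40) // resume token far behind the retained window
	fmt.Println(r.nextSeq, r.dropped)
}
```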

// Reader tracks an independent read position in a RingBuffer.
type Reader struct {
rb *RingBuffer
nextSeq uint64 // publish index, not BrowserEvent.Seq


more fundamentally, i think Event.Seq probably wants to be the public stream offset / resume token. right now the reader tracks a separate publish-index concept (nextSeq / written), which creates two seq-like notions in the model. i'd strongly consider aligning the reader/ring terminology and api around Event.Seq instead of introducing a second cursor concept.

r.nextSeq = oldest
r.rb.mu.RUnlock()
data := json.RawMessage(fmt.Sprintf(`{"dropped":%d}`, dropped))
return BrowserEvent{Type: "events.dropped", Category: CategorySystem, Source: SourceKernelAPI, Data: data}, nil


i'm a little nervous about representing drops as a normal Event. this feels like stream/transport metadata rather than producer-emitted content. maybe Read should return a separate result type, e.g. ReadResult{Event *Event, Dropped uint64}, where Event is nil when the reader has fallen behind. then the transport layer can choose how to surface drops on the wire without baking them into the canonical event schema.
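A sketch of how a consumer would branch on that suggested result type (the `handle` helper is hypothetical, for illustration only):

```go
package main

import "fmt"

// Event stands in for the canonical event type.
type Event struct{ Type string }

// ReadResult is the suggested contract: either an event or a drop count,
// never a fabricated event.
type ReadResult struct {
	Event   *Event // nil when Dropped > 0
	Dropped uint64 // events lost at this point in the stream
}

// handle shows how a transport layer might surface the two cases.
func handle(res ReadResult) string {
	if res.Dropped > 0 {
		return fmt.Sprintf("gap: %d events dropped", res.Dropped)
	}
	return "event: " + res.Event.Type
}

func main() {
	fmt.Println(handle(ReadResult{Event: &Event{Type: "console.log"}}))
	fmt.Println(handle(ReadResult{Dropped: 7}))
}
```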


const (
DetailMinimal DetailLevel = "minimal"
DetailDefault DetailLevel = "default"


nit: if this is meant to be a concrete level, i'd consider renaming default to standard. minimal/standard/verbose/raw reads like an actual ladder, whereas default feels more like a policy alias whose meaning could vary by producer.

)

// BrowserEvent is the canonical event structure for the browser capture pipeline.
type BrowserEvent struct {


nit: i wonder if Event would be a cleaner name than BrowserEvent. it feels more future-proof if extension/local-process/server-native producers end up as first-class peers.

mu sync.RWMutex
buf []BrowserEvent
head int // next write position (mod cap)
written uint64 // total ever published (monotonic)


nit: could we make this comment a bit more explicit? "written" took me a second to parse. something like "monotonic count of publishes into the logical stream" would make it clearer that this is not bytes-written and not Event.Seq.

buf []BrowserEvent
head int // next write position (mod cap)
written uint64 // total ever published (monotonic)
notify chan struct{}


nit: "notify" feels a bit ambiguous here since it doesn't say who is being notified. i'd consider renaming this to something reader-scoped like readerWake or readerNotify, plus a short comment that it's closed-and-replaced on each publish to wake blocked readers.
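The close-and-replace wake pattern described in that comment looks roughly like this (a standalone sketch, not the PR's implementation):

```go
package main

import (
	"fmt"
	"sync"
)

// RingBuffer sketches only the reader-wake mechanism: readerWake is
// closed and replaced on each publish so every blocked reader wakes.
type RingBuffer struct {
	mu         sync.Mutex
	readerWake chan struct{} // closed-and-replaced on each publish
}

func NewRingBuffer() *RingBuffer {
	return &RingBuffer{readerWake: make(chan struct{})}
}

func (rb *RingBuffer) publish() {
	rb.mu.Lock()
	close(rb.readerWake)                // wake everyone blocked on this channel
	rb.readerWake = make(chan struct{}) // next wait blocks on a fresh channel
	rb.mu.Unlock()
}

// wakeChan snapshots the channel a reader should block on; receiving
// from a closed channel returns immediately, so no wakeup is lost.
func (rb *RingBuffer) wakeChan() <-chan struct{} {
	rb.mu.Lock()
	defer rb.mu.Unlock()
	return rb.readerWake
}

func main() {
	rb := NewRingBuffer()
	ch := rb.wakeChan()
	done := make(chan bool)
	go func() {
		<-ch // blocks until publish closes the snapshot channel
		done <- true
	}()
	rb.publish()
	fmt.Println(<-done)
}
```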

  • Event is the agreed portable name. DetailStandard avoids Go keyword ambiguity with "default".
  • Moves CDP-specific fields (target_id, cdp_session_id, frame_id, parent_frame_id) under source.metadata. Top-level Event schema now contains only stable cross-producer fields.
  • …ut of Event: Event is now purely producer-emitted content. Pipeline-assigned metadata (seq, capture_session_id) lives on the Envelope. truncateIfNeeded operates on the full Envelope. Pipeline type comment now documents lifecycle semantics.
  • Ring buffer now indexes by envelope.Seq directly, removing the separate head/written counters. NewReader takes an explicit afterSeq for resume support. Renamed notify to readerWake for clarity.
  • Drops are now stream metadata (ReadResult.Dropped) rather than fake events smuggled into the Event schema. Transport layer decides how to surface gaps on the wire.
@archandatta

archandatta commented Apr 1, 2026

Thanks for the feedback, it all makes sense. I think the key pieces to update are making the event structure first-class and adopting the producer/consumer separation, which definitely keeps things a lot cleaner. Here is a quick summary of the new changes:

  • Envelope wrapper: seq and capture_session_id are pipeline-assigned metadata, not producer-emitted content.

    • Extracted them into an Envelope{CaptureSessionID, Seq, Event} struct. Event is now purely portable; external producers (extensions, local processes) don't need to know about pipeline sequencing or capture sessions. On disk the JSONL record is the full envelope; in memory the ring stores envelopes indexed directly by seq
  • Producer/consumer separation: subscriptions are now a pure consumer-side filter (category, detail_level, source kind). They never start, stop, or reconfigure a producer. Native producer config (CDP domains, throttling) is modeled separately and only applies to producers we control

  • Structured source: added source.metadata. The top-level Event now has only stable cross-producer fields, while CDP-specific context lives under source.metadata, so for example CDP context doesn't pollute other events

  • Unified seq: ring buffer now indexes by envelope.Seq directly, eliminating the separate written/head counters. NewReader takes an explicit resume position. One cursor concept everywhere: ring index, reader position, JSONL offset, client reconnect token

  • ReadResult: drops are stream metadata; the transport layer decides how to surface gaps on the wire

  • Renamed BrowserEvent → Event, DetailDefault → DetailStandard, notify → readerWake

@archandatta archandatta requested review from Sayan- and rgarcia April 1, 2026 11:55
truncateIfNeeded now warns if the envelope still exceeds the 1MB limit
after nulling data (e.g. huge url or source.metadata). Pipeline.Publish
skips the file write when marshal returns nil to avoid writing corrupt
bare-newline JSONL lines.
}
}
return firstErr
}


FileWriter.Close leaks descriptors on post-close writes

Low Severity

FileWriter.Close closes all file handles but doesn't clear the files map or mark itself as closed. If Write is later called for a new category, a new file is opened and stored in the map, but no subsequent Close will ever run — leaking the file descriptor. For an existing category, a write to the already-closed handle silently fails.



@rgarcia rgarcia left a comment


🔥

// them out to a FileWriter (durable) and RingBuffer (in-memory). Call Start
// once with a capture session ID, then Publish concurrently. Close flushes the
// FileWriter; there is no restart or terminal event.
type Pipeline struct {

one design question: now that this object owns seq allocation, envelope stamping, reader creation, and capture_session_id, i wonder if Pipeline is too generic a name. it feels more like a session-scoped event stream than a reusable pipeline primitive. would something like CaptureSession or CaptureStream be clearer? relatedly, if capture_session_id is meant to identify exactly one capture session, i think i'd prefer passing it into the constructor and dropping Start(), so the type can't exist in a pre-start / empty-session state and can't accidentally be reused across sessions.

@archandatta

went with CaptureSession!

// Event is the portable event schema. It contains only producer-emitted content;
// pipeline metadata (seq, capture session) lives on the Envelope.
type Event struct {
Ts int64 `json:"ts"`

@rgarcia rgarcia Apr 1, 2026


nit: could we document ts explicitly? right now it isn't obvious whether this is unix seconds, millis, or nanos. i do wonder if millisecond resolution is a bit coarse for this stream since multiple events can plausibly share the same ms; if we want higher fidelity, maybe consider unix micros.


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).


@archandatta archandatta force-pushed the archand/kernel-1116/browser-logging branch from 6b0f639 to bf091f5 on April 2, 2026 15:14
@archandatta archandatta merged commit 1c77850 into main Apr 2, 2026
5 checks passed
@archandatta archandatta deleted the archand/kernel-1116/browser-logging branch April 2, 2026 15:34