F084: Bound StreamFilterWriter Lines to ~10 MB #316

@pocky

Scope

In Scope

  • Apply a bounded scanner buffer (10 MB cap) to StreamFilterWriter to prevent silent stream abort on oversized NDJSON events
  • Structured logging when a line exceeds the cap
  • Unit tests covering oversized-line handling (9 MB pass-through, 11 MB graceful degradation)
  • Benchmark demonstrating no throughput regression on normal-sized input

Out of Scope

  • Changing the parsing contract (parseStreamLine / DisplayEventSource.ParseEvents)
  • Configurable cap via env var or per-provider override
  • Switching from bufio.Scanner to bufio.Reader.ReadBytes('\n') for skip-and-continue semantics

Deferred

| Item | Rationale | Follow-up |
| --- | --- | --- |
| Configurable cap (global or per-provider) | Default 10 MB sufficient for real-world transcripts; premature to expose knob | future |
| Manual bufio.Reader loop for skip-and-continue | Only needed if tests prove Scanner aborts unrecoverably on oversized lines | future |
| Per-parallel-step memory ceiling coordination | Single-stream cap is a local concern; global budgeting is a separate architectural question | future |

User Stories

US1: Stream Survives Oversized Events (P1 - Must Have)

As a workflow operator running an agent step,
I want the display stream to continue working after a single oversized NDJSON event,
So that a large content_block_delta or tool_use.input payload does not silently freeze the UI transcript for the rest of the step.

Why this priority: Without a larger cap, bufio.Scanner stops at the first line longer than its default 64 KB token limit (bufio.MaxScanTokenSize) and delivers none of the events after it. The step still exits 0, so the failure is invisible until someone notices the missing transcript tail. This is the core robustness guarantee F082 needs to hold in production.

Acceptance Scenarios:

  1. Given a provider emits an NDJSON stream containing a 9 MB event followed by normal events, When StreamFilterWriter processes the stream, Then DisplayOutput contains the parsed content of the 9 MB event and every subsequent normal event without loss.
  2. Given a provider emits an NDJSON stream containing an 11 MB event, When StreamFilterWriter processes the stream, Then the oversized line is handled per documented policy (skipped with structured warning, or scan stops with clear log) and no OOM occurs.
  3. Given a provider emits a stream with only normal-sized events, When StreamFilterWriter processes it, Then behaviour is identical to F082 baseline (no regression).

Independent Test: Feed a synthetic byte stream to StreamFilterWriter with a 9 MB JSON line and a following 200-byte line; assert both are parsed and forwarded to the line hook.

US2: Oversized Events Are Observable (P2 - Should Have)

As a workflow operator,
I want a structured warning in the logs when a stream line exceeds the cap,
So that I can detect pathological provider output without digging through missing transcripts.

Why this priority: Silent degradation is acceptable as last-resort behaviour, but operators need a signal. Without logging, oversized events remain invisible and we cannot tune the cap or report issues upstream.

Acceptance Scenarios:

  1. Given a stream contains one oversized line, When StreamFilterWriter encounters it, Then a structured warning is logged with the line size and provider identifier.
  2. Given a stream contains multiple oversized lines (if skip-and-continue is feasible), When processed, Then each oversized line is logged independently with a counter, not a single collapsed log.

Independent Test: Inject a spy logger into StreamFilterWriter, feed an 11 MB line, assert the logger received exactly one structured warning with the expected fields.

US3: Cap Does Not Regress Normal Throughput (P3 - Nice to Have)

As a workflow operator running high-volume agent steps,
I want the bounded buffer to not slow down normal-sized event processing,
So that the robustness fix does not cost us streaming performance in the common case.

Why this priority: Pre-allocating a 10 MB buffer could in principle change allocation patterns. A benchmark documents that the common-case path remains fast.

Acceptance Scenarios:

  1. Given a 10 MB stream of small NDJSON events (~200 B each), When BenchmarkStreamFilterWriter_LargeLines runs, Then throughput is within 5% of the F082 baseline.

Independent Test: Run go test -bench=BenchmarkStreamFilterWriter_LargeLines -benchmem before and after the change; compare ns/op and allocs/op.

Edge Cases

  • What happens when a line sits at the 10 MB boundary (10 MB - 1 byte, exactly 10 MB, and 10 MB + 1 byte)?
  • How does the system handle a stream where every line is oversized (stress case, logger must not spin unbounded)?
  • What is the behavior when the oversized line is the last line in the stream (no trailing newline)?
  • How does the system handle a truncated oversized line (EOF mid-payload)?

Requirements

Functional Requirements

  • FR-001: System MUST pre-allocate a scanner buffer in StreamFilterWriter with a maximum capacity of 10 MB (10 << 20 bytes).
  • FR-002: System MUST process NDJSON lines up to 10 MB in length without truncation or scan abort.
  • FR-003: System MUST log a structured warning when a line exceeds the 10 MB cap, including line size and provider context.
  • FR-004: System MUST ensure that cmd.Run completion status is unaffected by oversized-line events (no new error paths propagated upstream beyond what F082 already defined).
  • FR-005: System MUST continue processing subsequent lines after an oversized event, OR document in code comments that bufio.Scanner aborts unrecoverably and plan a follow-up switch to bufio.Reader.

Non-Functional Requirements

  • NFR-001: Memory usage per StreamFilterWriter instance MUST NOT exceed 10 MB for the scanner buffer (hard cap, no unbounded growth).
  • NFR-002: Throughput on normal-sized event streams (~200 B per line) MUST remain within 5% of the F082 baseline as measured by BenchmarkStreamFilterWriter_LargeLines.
  • NFR-003: Existing F082 test suite MUST continue to pass without modification.

Success Criteria

  • SC-001: A synthetic NDJSON stream containing a 9 MB event and 10 subsequent normal events produces DisplayOutput containing all 11 parsed events.
  • SC-002: An 11 MB event in a synthetic stream does not trigger OOM, unbounded buffer growth, or test timeout; graceful degradation is observable via structured log.
  • SC-003: BenchmarkStreamFilterWriter_LargeLines shows throughput regression <5% versus F082 baseline on normal-sized input.
  • SC-004: Zero new lint violations introduced; make lint and make lint-arch pass unchanged.

Key Entities

| Entity | Description | Key Attributes |
| --- | --- | --- |
| StreamFilterWriter | io.Writer decorator that forwards child-process stdout line-by-line to a provider's line parser | scanner buffer, line hook, logger |
| Oversized line event | A single NDJSON line exceeding the 10 MB cap | line size, provider id, timestamp |

Assumptions

  • 10 MB is sufficient for all realistic Claude/Gemini/Codex/OpenCode event payloads observed to date, including batched content_block_delta, large tool_use.input, and verbose result summaries.
  • Provider processes are trusted enough that the cap exists as a defence-in-depth measure, not an adversarial guard.
  • Logging infrastructure available inside StreamFilterWriter (structured logger reachable via provider context) is sufficient; no new port needed.
  • bufio.Scanner with a pre-sized buffer via scanner.Buffer(buf, 10<<20) is the minimal change; switch to bufio.Reader is deferred pending test evidence.

Metadata

  • Status: backlog
  • Version: v0.8.0
  • Priority: medium
  • Estimation: S

Dependencies

  • Blocked by: none
  • Unblocks: none

Clarifications

Section populated during clarify step with resolved ambiguities.

Notes

  • Independent of observations 01 (UsageSource) and 04 (DisplayEventSource.ParseEvents). Ships in any order; the cap applies at the StreamFilterWriter layer upstream of whatever parsing contract is in place.
  • If tests show bufio.Scanner aborts the scan unrecoverably on the first oversized line, open a follow-up to migrate to bufio.Reader.ReadBytes('\n') with manual size tracking for true skip-and-continue semantics.
  • Consider whether a future configurable cap (env var AWF_STREAM_LINE_MAX) is warranted once real-world telemetry on oversized-line frequency exists.
  • Worst-case memory for N-way parallel steps is N × 10 MB; flagged as a point of attention, not blocking this feature.
