F084: Bound StreamFilterWriter Lines to ~10 MB #316

# F084: Bound StreamFilterWriter Lines to ~10 MB

## Scope

### In Scope

- Apply a bounded scanner buffer (10 MB cap) to `StreamFilterWriter` to prevent silent stream abort on oversized NDJSON events
- Structured logging when a line exceeds the cap
- Unit tests covering oversized-line handling (9 MB pass-through, 11 MB graceful degradation)
- Benchmark demonstrating no throughput regression on normal-sized input

### Out of Scope

- Changing the parsing contract (`parseStreamLine` / `DisplayEventSource.ParseEvents`)
- Configurable cap via env var or per-provider override
- Switching from `bufio.Scanner` to `bufio.Reader.ReadBytes('\n')` for skip-and-continue semantics

### Deferred

| Item | Rationale | Follow-up |
|------|-----------|-----------|
| Configurable cap (global or per-provider) | Default 10 MB sufficient for real-world transcripts; premature to expose knob | future |
| Manual `bufio.Reader` loop for skip-and-continue | Only needed if tests prove `Scanner` aborts unrecoverably on oversized lines | future |
| Per-parallel-step memory ceiling coordination | Single-stream cap is a local concern; global budgeting is a separate architectural question | future |

---

## User Stories

### US1: Stream Survives Oversized Events (P1 - Must Have)

**As a** workflow operator running an agent step,
**I want** the display stream to continue working after a single oversized NDJSON event,
**So that** a large `content_block_delta` or `tool_use.input` payload does not silently freeze the UI transcript for the rest of the step.

**Why this priority**: With its default 64 KB token limit (`bufio.MaxScanTokenSize`), `bufio.Scanner` stops with `bufio.ErrTooLong` on the first longer line, and every subsequent event is lost. The step still exits 0, so the failure is invisible until someone notices the missing transcript tail. This is the core robustness guarantee F082 needs to hold in production.

**Acceptance Scenarios:**
1. **Given** a provider emits an NDJSON stream containing a 9 MB event followed by normal events, **When** `StreamFilterWriter` processes the stream, **Then** `DisplayOutput` contains the parsed content of the 9 MB event and every subsequent normal event without loss.
2. **Given** a provider emits an NDJSON stream containing an 11 MB event, **When** `StreamFilterWriter` processes the stream, **Then** the oversized line is handled per documented policy (skipped with structured warning, or scan stops with clear log) and no OOM occurs.
3. **Given** a provider emits a stream with only normal-sized events, **When** `StreamFilterWriter` processes it, **Then** behaviour is identical to F082 baseline (no regression).

**Independent Test:** Feed a synthetic byte stream to `StreamFilterWriter` with a 9 MB JSON line and a following 200-byte line; assert both are parsed and forwarded to the line hook.
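The independent test above can be approximated without the real type. The following is a minimal sketch, assuming `StreamFilterWriter` reads its lines through a `bufio.Scanner`; `scanLineSizes` and `syntheticStream` are hypothetical helpers introduced only for illustration and are not part of the codebase.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

const maxLineBytes = 10 << 20 // 10 MB cap proposed in this spec

// syntheticStream builds an NDJSON-like stream with one line per requested
// size: filler bytes followed by '\n'. (Hypothetical test helper.)
func syntheticStream(lineSizes ...int) io.Reader {
	var sb strings.Builder
	for _, n := range lineSizes {
		sb.WriteString(strings.Repeat("x", n))
		sb.WriteByte('\n')
	}
	return strings.NewReader(sb.String())
}

// scanLineSizes scans r with a Scanner whose buffer may grow up to limit
// bytes. It returns the length of every delivered line and the scanner error.
func scanLineSizes(r io.Reader, limit int) ([]int, error) {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), limit) // small initial buffer, bounded growth
	var sizes []int
	for sc.Scan() {
		sizes = append(sizes, len(sc.Bytes()))
	}
	return sizes, sc.Err()
}

func main() {
	// Scenario 1: a 9 MB line followed by a 200-byte line.
	sizes, err := scanLineSizes(syntheticStream(9<<20, 200), maxLineBytes)
	fmt.Println(len(sizes), err)

	// Scenario 2: an 11 MB line between two normal lines.
	sizes, err = scanLineSizes(syntheticStream(200, 11<<20, 200), maxLineBytes)
	fmt.Println(len(sizes), err)
}
```

In the second scenario the scanner delivers only the first line and then reports `bufio.ErrTooLong`; the trailing 200-byte line is never seen, which is the abort behaviour FR-005 asks to either fix or document.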

### US2: Oversized Events Are Observable (P2 - Should Have)

**As a** workflow operator,
**I want** a structured warning in the logs when a stream line exceeds the cap,
**So that** I can detect pathological provider output without digging through missing transcripts.

**Why this priority**: Silent degradation is acceptable as last-resort behaviour, but operators need a signal. Without logging, oversized events remain invisible and we cannot tune the cap or report issues upstream.

**Acceptance Scenarios:**
1. **Given** a stream contains one oversized line, **When** `StreamFilterWriter` encounters it, **Then** a structured warning is logged with the line size and provider identifier.
2. **Given** a stream contains multiple oversized lines (if skip-and-continue is feasible), **When** processed, **Then** each oversized line is logged independently with a counter, not a single collapsed log.

**Independent Test:** Inject a spy logger into `StreamFilterWriter`, feed an 11 MB line, assert the logger received exactly one structured warning with the expected fields.
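A spy-logger test along these lines might look like the sketch below. It assumes the structured logger is `log/slog`, which this spec does not state; `spyHandler`, `drain`, the attribute keys, and the `"claude"` provider label are all hypothetical. Note that `bufio.Scanner` only reports that a line was too long, not how long it was, so the sketch logs the cap instead of the actual line size.

```go
package main

import (
	"bufio"
	"context"
	"errors"
	"fmt"
	"io"
	"log/slog"
	"strings"
)

// spyHandler is a hypothetical slog.Handler that records every log record.
type spyHandler struct{ records []slog.Record }

func (h *spyHandler) Enabled(context.Context, slog.Level) bool { return true }
func (h *spyHandler) Handle(_ context.Context, r slog.Record) error {
	h.records = append(h.records, r.Clone())
	return nil
}
func (h *spyHandler) WithAttrs([]slog.Attr) slog.Handler { return h }
func (h *spyHandler) WithGroup(string) slog.Handler      { return h }

// drain scans r line by line and logs one warning if the scan stops because
// a line exceeded limit. It returns the number of lines delivered.
func drain(r io.Reader, limit int, provider string, logger *slog.Logger) int {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), limit)
	n := 0
	for sc.Scan() {
		n++
	}
	if errors.Is(sc.Err(), bufio.ErrTooLong) {
		logger.Warn("stream line exceeds cap; scan stopped",
			"provider", provider, "max_line_bytes", limit, "lines_before", n)
	}
	return n
}

// oversizedWarnings feeds a single 11 MB line through drain and returns
// whatever the spy captured.
func oversizedWarnings() []slog.Record {
	spy := &spyHandler{}
	input := strings.Repeat("x", 11<<20) + "\n"
	drain(strings.NewReader(input), 10<<20, "claude", slog.New(spy))
	return spy.records
}

func main() {
	for _, r := range oversizedWarnings() {
		fmt.Println(r.Level, r.Message, r.NumAttrs())
	}
}
```

If the actual line size is required in the warning, as FR-003 states, the reader would need to count bytes itself, which a plain `Scanner` does not expose.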

### US3: Cap Does Not Regress Normal Throughput (P3 - Nice to Have)

**As a** workflow operator running high-volume agent steps,
**I want** the bounded buffer to not slow down normal-sized event processing,
**So that** the robustness fix does not cost us streaming performance in the common case.

**Why this priority**: Pre-allocating a 10 MB buffer could in principle change allocation patterns. A benchmark documents that the common-case path remains fast.

**Acceptance Scenarios:**
1. **Given** a 10 MB stream of small NDJSON events (~200 B each), **When** `BenchmarkStreamFilterWriter_LargeLines` runs, **Then** throughput is within 5% of the F082 baseline.

**Independent Test:** Run `go test -bench=BenchmarkStreamFilterWriter_LargeLines -benchmem` before and after the change; compare ns/op and allocs/op.
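The shape of such a benchmark could resemble the sketch below. It is not the project's `BenchmarkStreamFilterWriter_LargeLines`; it measures a bare `bufio.Scanner` as a stand-in, assumes the F082 baseline uses the default scanner buffer, and uses `testing.Benchmark` so it can run as a plain program. `smallEventStream`, `benchScan`, and `measure` are hypothetical names.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"testing"
)

// smallEventStream builds roughly total bytes of 200-byte lines
// (199 filler bytes plus '\n'), standing in for small NDJSON events.
func smallEventStream(total int) []byte {
	line := append(bytes.Repeat([]byte("x"), 199), '\n')
	return bytes.Repeat(line, total/len(line))
}

// benchScan returns a benchmark that scans a 10 MB stream once per iteration.
// limit == 0 keeps the default scanner buffer (assumed baseline); otherwise
// the buffer may grow up to limit bytes.
func benchScan(limit int) func(b *testing.B) {
	data := smallEventStream(10 << 20)
	return func(b *testing.B) {
		b.SetBytes(int64(len(data)))
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sc := bufio.NewScanner(bytes.NewReader(data))
			if limit > 0 {
				sc.Buffer(make([]byte, 0, 64*1024), limit)
			}
			lines := 0
			for sc.Scan() {
				lines++
			}
			if sc.Err() != nil || lines == 0 {
				b.Fatal("scan failed")
			}
		}
	}
}

// measure runs the benchmark outside of `go test`.
func measure(limit int) testing.BenchmarkResult {
	return testing.Benchmark(benchScan(limit))
}

func main() {
	baseline := measure(0)
	bounded := measure(10 << 20)
	fmt.Println("baseline:", baseline, baseline.MemString())
	fmt.Println("bounded: ", bounded, bounded.MemString())
}
```

The sketch starts from a 64 KB buffer and lets it grow; replacing the initial slice with `make([]byte, 0, limit)` would measure the fully pre-allocated variant that the priority note above is concerned about.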

### Edge Cases

- What happens when a line is exactly at the 10 MB boundary (10 MB - 1 byte vs 10 MB + 1 byte)?
- How does the system handle a stream where every line is oversized (stress case, logger must not spin unbounded)?
- What is the behaviour when the oversized line is the last line in the stream (no trailing newline)?
- How does the system handle a truncated oversized line (EOF mid-payload)?
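The boundary question can be probed directly against `bufio.Scanner`. The `bufio` documentation states that the maximum token size must be less than the larger of `max` and `cap(buf)`, which suggests the largest line that fits is one byte short of the cap. The sketch below checks that on a bare scanner, not on `StreamFilterWriter` itself; `fits` is a hypothetical helper.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// fits reports whether a single line of lineLen bytes is delivered intact by
// a Scanner whose buffer may grow to at most limit bytes. trailingNewline
// controls whether the line ends with '\n' or runs straight into EOF.
func fits(lineLen, limit int, trailingNewline bool) bool {
	input := strings.Repeat("x", lineLen)
	if trailingNewline {
		input += "\n"
	}
	sc := bufio.NewScanner(strings.NewReader(input))
	sc.Buffer(make([]byte, 0, 64*1024), limit)
	return sc.Scan() && len(sc.Bytes()) == lineLen
}

func main() {
	const limit = 10 << 20
	for _, n := range []int{limit - 1, limit, limit + 1} {
		fmt.Printf("%d bytes: with newline=%v, at EOF=%v\n",
			n, fits(n, limit, true), fits(n, limit, false))
	}
}
```

With this setup a line of `limit - 1` bytes is delivered and a line of `limit` bytes is rejected, with or without a trailing newline, so tests for FR-002 may want to state whether "up to 10 MB" is inclusive.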

---

## Requirements

### Functional Requirements

- **FR-001**: System MUST pre-allocate a scanner buffer in `StreamFilterWriter` with a maximum capacity of 10 MB (`10 << 20` bytes).
- **FR-002**: System MUST process NDJSON lines up to 10 MB in length without truncation or scan abort.
- **FR-003**: System MUST log a structured warning when a line exceeds the 10 MB cap, including line size and provider context.
- **FR-004**: System MUST ensure that `cmd.Run` completion status is unaffected by oversized-line events (no new error paths propagated upstream beyond what F082 already defined).
- **FR-005**: System MUST continue processing subsequent lines after an oversized event, OR document in code comments that `bufio.Scanner` aborts unrecoverably and plan a follow-up switch to `bufio.Reader`.

### Non-Functional Requirements

- **NFR-001**: Memory usage per `StreamFilterWriter` instance MUST NOT exceed 10 MB for the scanner buffer (hard cap, no unbounded growth).
- **NFR-002**: Throughput on normal-sized event streams (~200 B per line) MUST remain within 5% of the F082 baseline as measured by `BenchmarkStreamFilterWriter_LargeLines`.
- **NFR-003**: Existing F082 test suite MUST continue to pass without modification.

---

## Success Criteria

- **SC-001**: A synthetic NDJSON stream containing a 9 MB event and 10 subsequent normal events produces `DisplayOutput` containing all 11 parsed events.
- **SC-002**: An 11 MB event in a synthetic stream does not trigger OOM, unbounded buffer growth, or test timeout; graceful degradation is observable via structured log.
- **SC-003**: `BenchmarkStreamFilterWriter_LargeLines` shows throughput regression <5% versus F082 baseline on normal-sized input.
- **SC-004**: Zero new lint violations introduced; `make lint` and `make lint-arch` pass unchanged.

---

## Key Entities

| Entity | Description | Key Attributes |
|--------|-------------|----------------|
| StreamFilterWriter | io.Writer decorator that forwards child-process stdout line-by-line to a provider's line parser | scanner buffer, line hook, logger |
| Oversized line event | A single NDJSON line exceeding the 10 MB cap | line size, provider id, timestamp |

---

## Assumptions

- 10 MB is sufficient for all realistic Claude/Gemini/Codex/OpenCode event payloads observed to date, including batched `content_block_delta`, large `tool_use.input`, and verbose `result` summaries.
- Provider processes are trusted enough that the cap exists as a defence-in-depth measure, not an adversarial guard.
- Logging infrastructure available inside `StreamFilterWriter` (structured logger reachable via provider context) is sufficient; no new port needed.
- `bufio.Scanner` with a pre-sized buffer via `scanner.Buffer(buf, 10<<20)` is the minimal change; switching to `bufio.Reader` is deferred pending test evidence.

---

## Metadata

- **Status**: backlog
- **Version**: v0.8.0
- **Priority**: medium
- **Estimation**: S

## Dependencies

- **Blocked by**: none
- **Unblocks**: none

## Clarifications

_Section populated during clarify step with resolved ambiguities._

## Notes

- Independent of observations 01 (`UsageSource`) and 04 (`DisplayEventSource.ParseEvents`). Ships in any order; the cap applies at the `StreamFilterWriter` layer upstream of whatever parsing contract is in place.
- If tests show `bufio.Scanner` aborts the scan unrecoverably on the first oversized line, open a follow-up to migrate to `bufio.Reader.ReadBytes('\n')` with manual size tracking for true skip-and-continue semantics.
- Consider whether a future configurable cap (env var `AWF_STREAM_LINE_MAX`) is warranted once real-world telemetry on oversized-line frequency exists.
- Worst-case memory for N-way parallel steps is N × 10 MB; flagged as a point of attention, not blocking this feature.
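For the possible follow-up mentioned in the notes, a skip-and-continue loop could be sketched as below. This is an illustration under assumptions, not a proposed implementation: it uses `bufio.Reader.ReadSlice` rather than `ReadBytes`, because `ReadBytes` accumulates the whole line before returning, so a size check after the call would come too late to bound memory. `readBounded`, `demo`, and the callback signatures are hypothetical.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// readBounded delivers each line of r (without its trailing '\n') to onLine.
// A line longer than limit bytes, newline included, is dropped and reported
// to onSkip with its total size; reading then resumes at the next line.
// The slice passed to onLine is reused and only valid during the call.
func readBounded(r io.Reader, limit int, onLine func([]byte), onSkip func(size int)) error {
	br := bufio.NewReaderSize(r, 64*1024)
	var line []byte // buffered bytes of the current line, never more than limit
	size := 0       // total bytes seen for the current line
	skipping := false
	for {
		chunk, err := br.ReadSlice('\n')
		size += len(chunk)
		if !skipping {
			if size > limit {
				skipping, line = true, line[:0] // stop buffering, keep counting
			} else {
				line = append(line, chunk...)
			}
		}
		if err == bufio.ErrBufferFull {
			continue // the line continues in the next chunk
		}
		if err != nil && err != io.EOF {
			return err
		}
		// End of line or end of stream.
		if skipping {
			onSkip(size)
		} else if size > 0 {
			onLine(bytes.TrimSuffix(line, []byte("\n")))
		}
		line, size, skipping = line[:0], 0, false
		if err == io.EOF {
			return nil
		}
	}
}

// demo runs readBounded over normal lines interleaved with two 11 MB lines.
func demo() (lines []string, skipped []int) {
	big := strings.Repeat("x", 11<<20)
	input := "a\n" + big + "\nb\n" + big + "\nc"
	err := readBounded(strings.NewReader(input), 10<<20,
		func(l []byte) { lines = append(lines, string(l)) },
		func(n int) { skipped = append(skipped, n) })
	if err != nil {
		panic(err)
	}
	return lines, skipped
}

func main() {
	lines, skipped := demo()
	fmt.Println(lines, skipped)
}
```

Because the loop counts bytes even while discarding, the skip callback receives the real line size, which is the field US2 wants in the structured warning.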

