
read tool reads entire file regardless of MAX_BYTES limit (200× slowdown after #27155) #27864

@aschina

Description

PR #27155 (effect(patch,tool): migrate patch/index and tool/read to AppFileSystem, commit aa8a41d1b, merged 2026-05-14) introduced a regression in packages/opencode/src/tool/read.ts. The read tool now reads the entire file from disk regardless of the MAX_BYTES early-termination limit. For large files this manifests as the tool appearing to "hang" — depending on file size and shape it ranges from a noticeable pause to multi-second freezes to OOM-style stalls on files with very long lines or no newlines (minified bundles, large CSV/JSON/lockfiles, log files).

Root cause

The migration replaced createReadStream + readline + for-await + break with an Effect Stream pipeline. The original code's break exited the consumer and tore down the underlying read stream. The new code sets a flags.done = true boolean that only short-circuits work inside the runForEach callback — it does not stop the upstream stream, so the entire file is still pulled from disk and passed through Stream.splitLines.

// packages/opencode/src/tool/read.ts (current)
yield* fs.stream(filepath).pipe(
  Stream.map((bytes) => decoder.decode(bytes, { stream: true })),
  Stream.splitLines,
  Stream.runForEach((text) =>
    Effect.sync(() => {
      if (flags.done) return                    // skips work, doesn't stop upstream
      ...
      if (flags.bytes + size > MAX_BYTES) {
        flags.done = true
        return                                  // no break / fail / interrupt
      }
      ...
    }),
  ),
)

The PR description acknowledges the change ("The done flag in the callback replaces the original break") but doesn't account for the fact that exiting a for-await loop is not equivalent to setting a flag in a runForEach callback — the latter doesn't propagate to the producer.
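For contrast, here is a minimal standalone sketch (not the actual opencode source) of the pre-migration pattern: exiting a `for await` loop over a readline interface runs the async iterator's `return` path, which closes the interface and destroys the underlying file read stream, so no further bytes are pulled from disk.

```typescript
import { createReadStream, writeFileSync } from "node:fs"
import { tmpdir } from "node:os"
import { join } from "node:path"
import * as readline from "node:readline"

// Illustrative helper: read at most `max` lines, then bail out.
async function headLines(path: string, max: number): Promise<string[]> {
  const input = createReadStream(path)
  const rl = readline.createInterface({ input, crlfDelay: Infinity })
  const out: string[] = []
  for await (const line of rl) {
    out.push(line)
    if (out.length >= max) break // tears down rl AND the file stream
  }
  return out
}

const path = join(tmpdir(), "readline-break-demo.txt")
writeFileSync(path, Array.from({ length: 100_000 }, (_, i) => `line ${i}`).join("\n"))
const lines = await headLines(path, 5)
console.log(lines.length) // 5 — the remaining ~99,995 lines are never read
```

A `flags.done` check inside a `runForEach` callback has no such teardown effect: the callback is invoked once per element the producer has already emitted.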

Reproduction

I reproduced this against effect@4.0.0-beta.65 + @effect/platform-node-shared@4.0.0-beta.48 (the versions pinned in this repo) using a 500 MB text file with normal-length lines (80 bytes each). Same MAX_BYTES = 50 KB, same limit = 2000, same offset = 1:

| Implementation | Wall time | RSS | Lines actually consumed by splitLines |
| --- | --- | --- | --- |
| Old (createReadStream + readline + break) | 4 ms | 92 MB | ~640 (stops after MAX_BYTES) |
| New (Stream.runForEach + done flag) | 811 ms | 204 MB | 6,553,600 (entire file) |

That's a ~200× slowdown for any file substantially larger than MAX_BYTES. Extrapolated:

  • 5 GB log/lockfile → ~8 s perceived freeze
  • 50 GB → ~80 s, plus likely OOM from V8 string growth
  • File with very long lines or no newlines (minified JS, single-line JSON/CSV, single-row dumps that pass the binary-detection sample): Stream.splitLines accumulates internally in stringBuilder; memory grows O(filesize) and string concatenation becomes O(n²) before it ever finishes.

Concurrently with an LLM call, the user sees OpenCode "hang" on a read.

Affected scope

Any read invocation on a file larger than MAX_BYTES (50 KB by default), particularly:

  • Large lockfiles (bun.lock, package-lock.json, pnpm-lock.yaml)
  • Compiled / minified JavaScript bundles
  • Large logs and CSV/JSONL dumps
  • Snapshot files (e.g. packages/core/src/models-snapshot.js is 2.4 MB)
  • Any binary that slips past the 4096-byte sample heuristic

Suggested fixes

The fix needs to actually stop the upstream pull when MAX_BYTES is reached. A few options, in order of risk:

A. Tagged-error short-circuit (smallest behavioral change):

class ReadStop extends Data.TaggedError("ReadStop")<{}> {}

yield* fs.stream(filepath).pipe(
  Stream.map(...),
  Stream.splitLines,
  Stream.runForEach((text) =>
    flags.done
      ? Effect.fail(new ReadStop())           // failing the consumer interrupts the upstream pull
      : Effect.sync(() => { /* per-line work */ }),
  ),
).pipe(Effect.catchTag("ReadStop", () => Effect.void))

B. Byte-bounded Stream.takeWhile upstream of splitLines:

let bytesPulled = 0
yield* fs.stream(filepath).pipe(
  Stream.takeWhile((chunk) => {
    bytesPulled += chunk.length
    return bytesPulled <= MAX_BYTES + chunkSize  // one extra chunk for last-line completion
  }),
  Stream.map(...),
  Stream.splitLines,
  Stream.runForEach(...),
)

Needs care around split UTF-8 sequences across chunk boundaries (the manual TextDecoder({ stream: true }) already handles this).
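To make that caveat concrete, here is a standalone illustration (plain Node, not the opencode code) of why `TextDecoder` in streaming mode tolerates chunk boundaries that option B's byte cutoff might create mid-codepoint: with `{ stream: true }`, the decoder buffers an incomplete multi-byte sequence until the next `decode` call instead of emitting a replacement character.

```typescript
// "é" is 0xC3 0xA9 in UTF-8; split it across two chunks.
const chunk1 = new Uint8Array([0x61, 0xc3]) // "a" + first byte of "é"
const chunk2 = new Uint8Array([0xa9, 0x62]) // second byte of "é" + "b"

// Streaming mode holds the dangling 0xC3 until the next call:
const decoder = new TextDecoder("utf-8")
const streamed =
  decoder.decode(chunk1, { stream: true }) + decoder.decode(chunk2, { stream: true })
console.log(streamed) // "aéb"

// Decoding each chunk independently mangles the boundary into U+FFFD:
const naive = new TextDecoder().decode(chunk1) + new TextDecoder().decode(chunk2)
console.log(naive) // "a\uFFFD\uFFFDb"
```

So the `takeWhile` cutoff only needs to land on a chunk boundary, not a codepoint boundary, as long as the decode stays in streaming mode.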

C. Revert just the lines helper to createReadStream + readline (lowest risk):

Keep the rest of #27155 (the patch/index.ts migration is fine); only the line-streaming helper had a working pre-migration form. Trade-off: tool/read.ts would no longer be 100% AppFileSystem-routed.

Related: shell tool finalizer has no timeout

While reviewing, I noticed #27517 (fix(tool): close shell truncation stream, commit e26abd8da) added a finalizer in packages/opencode/src/tool/shell.ts that awaits stream.end(...) + 'finish' / 'close' / 'error' events on every truncation sink — with no timeout:

yield* Effect.promise(
  () =>
    new Promise<void>((resolve) => {
      ...
      stream.once("close", done)
      stream.once("error", done)
      stream.once("finish", done)
      stream.end(done)
    }),
).pipe(Effect.catch(() => Effect.void))   // dead code: Effect.promise never fails

In rare cases (disk pressure, EBADF, file deleted during write) the write stream may not emit any of those events, leaving the finalizer hanging forever. The scope can't release, the shell tool call hangs, the user sees "OpenCode froze on a shell command." Adding Effect.race(..., Effect.sleep("3 seconds")) would bound it.

This is a much rarer hang than the read regression, but worth mentioning since I encountered it during the same investigation.

Plugins

No response

OpenCode version

1.15.1 (also affects dev; the read.ts line-streaming code is unchanged since #27155)

Steps to reproduce

  1. Create a large text file: dd if=/dev/zero bs=1M count=500 2>/dev/null | tr '\0' 'a' > /tmp/big.txt (or any large lockfile / log)
  2. Ask OpenCode to read /tmp/big.txt
  3. Observe the freeze / latency. Compare against 1.14.x (before #27155, "effect(patch,tool): migrate patch/index and tool/read to AppFileSystem") for the original behavior.

Operating System

macOS 26.5 (arm64), but the bug is platform-independent — it's purely in the Effect Stream consumer logic.

Terminal

N/A — affects every TUI / desktop / CLI invocation of the read tool.
