# Bug: O(n²) string concatenation in `readLines` causes OOM on large stdout
## Summary

The `readLines()` function in `@e2b/code-interpreter` uses `buffer += chunk` to accumulate the HTTP response body. This is O(n²) string concatenation in JavaScript — each `+=` copies the entire existing buffer plus the new chunk into a fresh string. For large stdout outputs (>1 MB), this causes massive memory amplification and multi-second event loop stalls, leading to OOM kills on the host process.
## Environment

- `@e2b/code-interpreter`: 2.3.3 (also confirmed on 2.4.0 — same code)
- Node.js: v22
- OS: Linux (Kubernetes pods, 4 GB memory limit)
## Reproduction

Create a sandbox and run code that produces ~22 MB of stdout:

```ts
import { Sandbox } from '@e2b/code-interpreter'

const sandbox = await Sandbox.create()

// Generate ~22 MB of stdout
const execution = await sandbox.runCode(`print("x" * 22_000_000)`)
```

Then monitor the host Node.js process memory. The heap will spike to 1–1.5 GB and the event loop will stall for 10–20 seconds.
## Root Cause

In `js/src/utils.ts`, `readLines()`:

```ts
buffer += new TextDecoder().decode(value) // line 14
```
Each iteration creates a new string of size `len(buffer) + len(chunk)` while the old buffer is still referenced. For a 22 MB response arriving in ~1,400 chunks of ~16 KB:

- Total bytes copied: `Σ(i × 16 KB)` for i = 1..1400 ≈ 15.7 GB of string allocations
- Peak heap: 1.5 GB+ (V8 can't GC fast enough under allocation pressure)
- Event loop stalls: 10–20 seconds (GC pauses)
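The copy-cost arithmetic above can be checked directly (using 16 KB = 16,000 bytes, which matches the ~15.7 GB figure):

```typescript
// Sanity-check of the copy cost: on iteration i the concatenation produces
// a string holding i chunks, so i × chunkSize bytes are written. Summing
// over all iterations gives the total bytes copied.
const chunkSize = 16_000 // ~16 KB per chunk (decimal)
const numChunks = 1400   // ~22 MB total response

let totalCopied = 0
for (let i = 1; i <= numChunks; i++) {
  totalCopied += i * chunkSize
}

console.log(`${(totalCopied / 1e9).toFixed(1)} GB copied`) // 15.7 GB copied
```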
## Evidence

We captured a V8 heap snapshot on a production worker after processing 22 MB of stdout. A single retained string — `{"type":"stdout","text":"..."}` — consumed 119,920 kB (117 MB, 37% of the heap). The retainer chain traces directly to the `readLines` async generator's `parameters_and_registers` (the `buffer` local variable), held through the ReadableStream reader → Promise chain → fetch Request body.
## Measured Impact

| Metric | Current (`buffer +=`) | Fixed (array + join) |
|---|---|---|
| Peak heap | 211 MB | 20 MB |
| Peak RSS | 329 MB | 95 MB |
| Elapsed time | 29.8s | 4.1s |
| Memory amplification | 9x | 0.9x |
(Standalone benchmark with 22 MB stdout. Production workers with existing heap pressure show 80x amplification.)
## Suggested Fix

Replace quadratic string concatenation with array-based buffering (the unused `searchStart` variable from the draft is dropped; `start` inside the loop does that job):

```ts
// js/src/utils.ts – readLines()
export async function* readLines(
  stream: ReadableStream<Uint8Array>
): AsyncGenerator<string> {
  const reader = stream.getReader()
  const decoder = new TextDecoder()
  const chunks: string[] = [] // ← array instead of a growing string

  try {
    while (true) {
      const { done, value } = await reader.read()
      if (value !== undefined) {
        chunks.push(decoder.decode(value, { stream: true }))
      }
      if (done) {
        const remaining = chunks.join('')
        if (remaining.length > 0) {
          yield remaining
        }
        break
      }
      // Check for newlines in accumulated data
      const buffer = chunks.join('')
      let newlineIdx: number
      let start = 0
      while ((newlineIdx = buffer.indexOf('\n', start)) !== -1) {
        yield buffer.slice(start, newlineIdx)
        start = newlineIdx + 1
      }
      // Keep only the remainder after the last newline
      chunks.length = 0
      if (start < buffer.length) {
        chunks.push(buffer.slice(start))
      }
    }
  } finally {
    reader.releaseLock()
  }
}
```
This changes the complexity from O(n²) to O(n) and eliminates the OOM risk for large outputs.
Happy to open a PR if this approach looks good.