Skip to content

Deeply nested input overflows the stack and aborts the process 🌀 #175

@timfennis

Description

@timfennis

Summary

Deeply nested source crashes the process with fatal runtime error: stack overflow (SIGABRT). This is an uncatchable abort, not a Rust panic, so it cannot be turned into a normal error at the point of failure.

Reproductions

Each of these aborts from a short script:

((((((((( ... )))))))))   // nested parentheses
[[[[[[[[[ ... ]]]]]]]]]   // nested list literals
{{{{{{{{{ ... }}}}}}}}}   // nested blocks
(((((1,) ... )))          // nested tuples
f(f(f( ... )))            // nested calls
----------- ... 1         // long unary-minus chain
not not not ... true      // long `not` chain
2^2^2^ ... ^2             // long right-associative `^` chain

The crash needs roughly 100–150 levels of nesting on a Linux 8 MB main-thread stack (far fewer on smaller stacks — see below).

Why it happens

Every layer of the pipeline walks the tree recursively, one native stack frame per nesting level:

  • ndc_parser — recursive-descent parser (single_expression and the self-recursive tight_unary / logic_not / exponent).
  • ndc_analyser, ndc_vm compiler, and the VM runtime — all recurse over the AST.

So the overflow is not confined to one crate; a tree deep enough to survive the parser can still overflow the analyser, compiler, or VM.

Measured first-overflow depth (nested list literals, debug build):

Stack First phase to overflow Approx. depth
8 MB (CLI main thread) full pipeline ~128
2 MB (cargo test worker thread) analyser/compiler ~20

Proposed direction

A single depth limit in the parser is an effective gatekeeper — if the parser rejects input deeper than N, no later phase ever sees a tree deeper than N, so one guard protects all four phases. That part is cheap.

The hard part is choosing N, because the safe depth scales with stack size, and the value that suits the 8 MB CLI main thread (~100) is unsafe on the 2 MB threads the test harness spawns. Tuning the limit to the smallest stack would make it uncomfortably restrictive for real code; tuning it to 8 MB makes the regression hard to validate in the test harness.

Preferred approach: run the parse → analyse → compile → run pipeline on a dedicated worker thread with a large, explicit stack (the same technique rustc uses), and pair that with a generous parser depth limit tuned to that stack. This makes every phase robust regardless of the caller's thread stack, gives both the CLI and the test harness a consistent stack size, and lets the depth limit be high enough not to bother real programs.

Implementation note: Interpreter is non-Send (it uses Rc), so the worker thread must create, use, and drop the interpreter entirely inside the spawned closure, returning only a Send result.

Alternatives considered

  • Parser depth limit only, tuned to 8 MB. Simple and protects the shipped CLI, but the regression test can't run on the 2 MB test threads without also raising the harness stack (RUST_MIN_STACK or a per-test big-stack thread).
  • Per-phase depth guards in analyser/compiler/VM in addition to the parser. Thorough but spreads the same logic across four crates; redundant once the parser gates depth.

Regression test

Whatever lands should include a test that feeds each nesting form above past the limit and asserts a clean error instead of an abort.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workinginterpreterIssue relates to the interpreter

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions